The CAHPS Health Plan (HP CAHPS) Survey is a survey that asks health plan enrollees to report about their care and health plan experiences as well as the quality of care received from physicians. HP CAHPS Version 4.0 was endorsed by NQF in July 2007, and Version 5.0 received maintenance endorsement in January 2015 and was last endorsed in Spring 2019 (CBE #0006). The 5.1 version of the CAHPS Health Plan Survey, released in the fall of 2020, explicitly asks about respondents’ experiences with care received in person, by phone, and by video to account for changes in care due to the pandemic. The survey is part of the CAHPS family of patient experience surveys and is available in the public domain at https://www.ahrq.gov/cahps/surveys-guidance/hp/index.html.
The Adult CAHPS Health Plan Survey is designed to be administered to individuals 18 years and older who have been enrolled in a health plan and have received care for a specified period (6 months or longer for Medicaid version, 12 months or longer for Commercial version) with no more than one 30-day break in enrollment. The CAHPS Adult Health Plan Survey has 39 items. Ten (10) of the survey items are used to form 4 composite measures. The survey also has 4 single-item rating measures.
The Child CAHPS Health Plan Survey is designed to be administered to parents or guardians of children aged 0-17 who have been enrolled in a health plan and have received care for a specified period (6 months or longer for Medicaid version, 12 months or longer for Commercial version) with no more than one 30-day break in enrollment. The CAHPS Child Health Plan Survey has 41 items. Eleven (11) of the survey items are used to form 4 composite measures. The survey also has 4 single-item rating measures.
The composite measures are:
- Getting Needed Care
- Getting Care Quickly
- How Well Doctors Communicate
- Health Plan Customer Service
The survey also has 4 single-item rating measures:
- Rating of Personal Doctor
- Rating of Specialist
- Rating of Health Care
- Rating of Health Plan
The only difference between the Medicaid and commercial versions of the CAHPS Health Plan Survey is the reference period: 6 months for Medicaid enrollees and 12 months for commercial enrollees.
A guidance document is available on the AHRQ CAHPS website (https://www.ahrq.gov/cahps/surveys-guidance/hp/index.html) which explains how to field the CAHPS Health Plan Survey and gather the data needed for analysis and reporting. It provides instructions and advice related to the following topics: constructing the sampling frame, choosing the sample, maintaining confidentiality, collecting the data, tracking returned questionnaires, and calculating the response rate.
The Child HP CAHPS Survey Rating of Specialist measure assesses the respondent’s overall rating, on a scale from 0 to 10, of the specialist their child talked to most often in the last 6 months, with 0 being the worst and 10 being the best.
Measure Specs
General Information
The CAHPS Health Plan (HP CAHPS) Survey assesses aspects of health care delivery that are important to patients and for which patients are the best or only source of information (Cleary, Edgman-Levitan, 1997; Cleary, 2016; Solomon et al., 2005). Further, the HP CAHPS Survey focuses on patient-centered care, which is a key element of health care quality (IOM, 2001). A focus on the patient experience has the potential to enhance clinical outcomes, improve patient safety, and reduce unnecessary medical services. Moreover, assessing patient experience through surveys that include data on the demographic characteristics of respondents, such as race and ethnicity, can help identify the extent to which positive experiences are distributed equitably across patients (Haviland et al., 2003). Use of this measure will benefit both patients and health plans:
- Patients can use information from the measures to help make more informed choices about which health plan to use.
- Health plans and their providers can use data from the surveys for quality improvement initiatives and incentives.
- Researchers can use data files from the surveys to help answer important health services research questions.
Patient experience encompasses the range of interactions that patients have with the healthcare system. The terms patient satisfaction and patient experience are often used interchangeably, but they are not the same. CAHPS surveys ask patients to report on what they experienced in a healthcare encounter—for example, whether something happened or how often it happened. Patient experience of care surveys provide actionable, objective information for quality improvement. Patient satisfaction surveys, on the other hand, use ratings to measure whether a patient’s expectations about a health encounter were met.
The HP CAHPS Survey is a standardized survey instrument for measuring enrollees’ perspectives on their care. The survey is generally administered annually to patients who have received care in the last 6 months (12 months for Commercial).
References
Cleary, PD, Edgman-Levitan, S. (1997). Health care quality. Incorporating consumer perspectives. JAMA. 278(19), 1608-12.
Cleary, PD. (2016). Evolving concepts of patient-centered care and the assessment of patient care experiences; optimism and opposition. J Health Pol, Policy & Law, 41 (4), 675-696.
Haviland, M. et al. (2003). Do health care ratings differ by race or ethnicity? Joint Commission Journal on Quality and Safety. 29(3), 134-145.
Institute of Medicine. (2001). Crossing the Quality Chasm: A New Health System for the 21st Century. Accessible at https://nap.nationalacademies.org/catalog/10027/crossing-the-quality-ch….
Solomon, L., Hays, RD., Zaslavsky, A., & Cleary, PD. (2005). Psychometric properties of the Group-Level Consumer Assessment of Health Plans Study (CAHPS) instrument. Medical Care, 43, 53-60.
The Child HP CAHPS Survey Rating of Specialist captures the overall experience with specialists seen. Specialists provide targeted care for specific health issues. This rating reflects the perceived quality of care by the specialist. Improving performance on this measure promotes better coordination and trust in specialty care, leading to more accurate diagnoses, timely interventions, and improved treatment outcomes. Enhanced specialist care can reduce unnecessary procedures and hospitalizations, lowering healthcare costs while increasing satisfaction.
The CAHPS Health Plan Survey (HP CAHPS) Database is a central repository of survey data from State Medicaid agencies, State Children's Health Insurance Programs (CHIP), and individual health plans that have administered the HP CAHPS Survey and chosen to submit their data to the Database. The 2024 HP CAHPS Database included 69,505 Adult Medicaid respondents from 233 health plans and 111,833 Child Medicaid respondents from 234 health plans.
Numerator
The response options for this measure range from 0 to 10, where higher scores indicate more positive ratings. AHRQ calculates the score for this item using a top box scoring method. For the top box or “top proportion” score, the numerator is the number of respondents who answered “9” or “10.”
The rating question is as follows:
- Q23: We want to know your rating of the specialist your child talked to most often in the last [6 months]. Using any number from 0 to 10, where 0 is the worst specialist possible and 10 is the best specialist possible, what number would you use to rate that specialist?
No additional detail, refer to 1.14.
Denominator
The measure’s denominator is the number of respondents to the survey item. The target population for the survey consists of patients who were enrolled in their health plan for at least 6 months. This time frame is also known as the look back period. The sampling frame is a person-level list and not a visit-level list.
This question was only asked of respondents who indicated they made an appointment for the child with a specialist and their child has talked to at least one specialist (i.e., responded yes to the question “In the last 6 months, did you make any appointments for your child with a specialist?” and responded with at least 1 specialist to “How many specialists has your child talked to in the last 6 months?”).
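The numerator and denominator rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`made_specialist_appt`, `num_specialists`, `q23_rating`) are hypothetical and are not taken from the CAHPS data specifications, and the function is not part of the CAHPS Analysis Program.

```python
from typing import Optional


def specialist_top_box(records: list[dict]) -> Optional[float]:
    """Return the top box percentage for the Rating of Specialist item.

    Each record is a dict with hypothetical keys:
      made_specialist_appt (bool), num_specialists (int),
      q23_rating (int from 0 to 10, or None if unanswered).
    """
    # Denominator: respondents who passed both screener questions
    # and gave a valid 0-10 answer to the rating item.
    eligible = [
        r for r in records
        if r.get("made_specialist_appt")
        and (r.get("num_specialists") or 0) >= 1
        and r.get("q23_rating") is not None
        and 0 <= r["q23_rating"] <= 10
    ]
    if not eligible:
        return None  # no score can be computed without respondents

    # Numerator: respondents who answered 9 or 10.
    numerator = sum(1 for r in eligible if r["q23_rating"] >= 9)
    return 100.0 * numerator / len(eligible)
```

Respondents who skipped the item or did not pass the screeners drop out of the denominator rather than counting against the score.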
Exclusions
Individuals are excluded from the denominator if:
- They were not continuously enrolled in the health plan (excepting an allowable enrollment lapse of less than 30 days).
- Their primary health coverage was not through the plan.
- Another member of their household had already been sampled.
- They had been institutionalized (put in the care of a specialized institution) or were deceased.
- Survey respondents who did not answer at least one item of a measure are excluded from a measure’s denominator.
- Some users also exclude a survey from scoring and analysis if someone else answered the questions (as a proxy) for the respondent.
The denominator is the total number of surveys fielded minus the total number of ineligible surveys. Ineligible surveys are sample cases deemed ineligible because they do not meet the eligible population criteria (refer to Section 1.15b). No other cases are excluded from this denominator, but cases are excluded from the denominator of the measure if they did not answer any item within the measure.
Measure Calculation
Respondents report on their experiences accessing and using care, and interacting with their health plans, over the past 6 months (Medicaid) or 12 months (Commercial Health Plans).
AHRQ calculates HP CAHPS Survey measure scores using a top box scoring method.
Composite Measures:
There are two basic steps to calculating a composite measure score for a health plan:
- Calculate the proportion of responses in the top box or most positive response category for each question in a composite measure.
- Calculate the mean or average top box scores across all questions in a composite measure to determine the composite measure's top box score.
For the top box or “top proportion” score, the numerator is the number of respondents who answered that they “Always” received the desired care or service for a given measure. For example, if 400 out of 1,000 total respondents answered “Always” to a composite measure item, the top box score for that item would be 40 percent [i.e., (400 ÷ 1,000)*100 = 40%].
Lower proportion and middle proportion composite measure scores can also be calculated following the same methodology where the lower proportion is the proportion answering “Never” or “Sometimes” and the middle proportion is the proportion answering “Usually”.
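The two-step composite calculation described above can be sketched as follows. This is a minimal illustration, not the CAHPS Analysis Program: the function names and the item keys used in the example are hypothetical, and unanswered items are assumed to be coded as `None`.

```python
def item_top_box(responses: list, top: str = "Always") -> float:
    """Step 1: proportion of answered responses in the top box category."""
    answered = [r for r in responses if r is not None]
    if not answered:
        raise ValueError("item has no valid responses")
    return sum(1 for r in answered if r == top) / len(answered)


def composite_top_box(items: dict[str, list]) -> float:
    """Step 2: mean of the item-level top box proportions, as a percentage."""
    if not items:
        raise ValueError("composite has no items")
    proportions = [item_top_box(responses) for responses in items.values()]
    return 100.0 * sum(proportions) / len(proportions)
```

For instance, a two-item composite in which 400 of 1,000 respondents answered “Always” to one item and 600 of 1,000 to the other would score 50 percent, because each item contributes equally regardless of how many respondents answered it.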
Rating Items:
For the rating items, the numerator for the top box score is the number of respondents who responded 9 or 10 on the 0-10 scale (where 10 is the “Best” and 0 is the “Worst”). For example, if 600 out of 1,000 total respondents answered “9” or “10” to a rating item, the top box score for that item would be 60 percent [i.e., (600 ÷ 1,000)*100 = 60%].
Lower proportion and middle proportion rating scores can also be calculated where the lower proportion is the proportion answering 0-6 on the 0-10 scale and the middle proportion is the proportion answering 7 or 8.
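The lower, middle, and top proportions for a rating item can be sketched together. The function name and return format are assumptions made for illustration; only the cut points (0-6, 7-8, 9-10) come from the text above.

```python
def rating_proportions(ratings: list) -> dict[str, float]:
    """Split valid 0-10 ratings into lower, middle, and top box percentages."""
    # Drop unanswered (None) and out-of-range values before scoring.
    valid = [r for r in ratings if r is not None and 0 <= r <= 10]
    n = len(valid)
    if n == 0:
        raise ValueError("no valid ratings")

    lower = sum(1 for r in valid if r <= 6)        # 0-6
    middle = sum(1 for r in valid if r in (7, 8))  # 7 or 8
    top = sum(1 for r in valid if r >= 9)          # 9 or 10
    return {
        "lower": 100.0 * lower / n,
        "middle": 100.0 * middle / n,
        "top": 100.0 * top / n,
    }
```

The three categories are exhaustive over valid responses, so the percentages sum to 100.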
Users may also choose to calculate mean scores or linearized mean scores.
Note the survey includes screener items to identify respondents who meet the target process for each measure, such as whether the individual sought any medical care, saw a personal doctor, saw a specialist, or interacted with the health plan’s customer service. Measures are only calculated using respondents who experienced a particular service/process.
Users can also case-mix adjust the results for characteristics such as respondent age, education, general health status, and mental health status. The CAHPS Analysis Program—often referred to as the CAHPS Macro—is a free program written in SAS (version 6.0 or later) that enables survey users to case-mix adjust their data. The program also generates a distribution of survey results for each of the measures, calculates the mean score for both individual survey items and composite measures, and indicates whether an entity’s scores are statistically different from the average. The results presented in these analyses are based on unadjusted top box scores unless otherwise noted.
More information about the calculation of proportion scores and mean scores can be found in these documents:
- Instructions for Preparing Data for Analysis: https://www.ahrq.gov/sites/default/files/wysiwyg/cahps/surveys-guidance…
- How Results are Calculated: https://www.ahrq.gov/sites/default/files/wysiwyg/cahps/cahps-database/2…
- Instructions for Analyzing Data from CAHPS Survey: https://www.ahrq.gov/sites/default/files/wysiwyg/cahps/surveys-guidance…
The measure is not stratified.
Users should choose a data collection protocol that maximizes the survey response rate at an acceptable cost. Some sponsors, as well as researchers conducting field tests, have found that mail with telephone follow-up, or email with mail or telephone follow-up, is most effective.
AHRQ provides protocols for collecting responses, though users can adapt them to meet their needs. The protocols include mail only, telephone only, mail with phone follow-up, or email (web) with mail or phone follow-up. AHRQ provides detailed instructions for these different protocols in the “Fielding the CAHPS Health Plan Survey” document, available on the AHRQ CAHPS website (https://www.ahrq.gov/cahps/surveys-guidance/hp/index.html) in the “Guidance for using the CAHPS Health Plan Survey” zip file.
There is no minimum response rate requirement on the HP CAHPS Survey. The CAHPS consortium has found that higher response rates are achievable if users take steps to ensure the accuracy of the sample frame and carefully follow the recommended data collection protocol, including one or more attempts to follow up with non-respondents.
In its simplest form, the response rate is the total number of completed questionnaires divided by the total number of individuals selected for the sample. Calculating the response rate is helpful in determining a more accurate starting sample size for future survey administration. For the CAHPS Health Plan Survey, the goal is a response rate of at least 40 percent for Medicaid plans (and/or 300 completed surveys) and 50 percent for commercial plans.
To calculate the response rate, use the following formula: Number of completed returned questionnaires divided by the total number of respondents selected minus the sum of deceased + ineligibles.
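As a sketch, the response rate formula above translates directly into code. The function and parameter names are hypothetical.

```python
def response_rate(completed: int, sampled: int, deceased: int, ineligible: int) -> float:
    """Response rate (%) = completed / (sampled - (deceased + ineligible))."""
    eligible_sample = sampled - (deceased + ineligible)
    if eligible_sample <= 0:
        raise ValueError("no eligible sample members remain")
    return 100.0 * completed / eligible_sample
```

For example, 300 completed questionnaires from a sample of 650, of whom 10 were deceased and 40 ineligible, gives 300 ÷ 600 = 50 percent.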
AHRQ makes the HP CAHPS Survey available in English and Spanish.
The sample design is based on the units for which users want to compare results, such as health insurance plans or products within health plans. For the purposes of this discussion, “Health insurance plan” is the entity that offers the health insurance (e.g., Plan A), and the “product” is the specific benefit plan design or coverage offered by the plan (e.g., Plan A’s HMO product). Users draw a sample for each health insurance plan or product about which they want to make inferences, separating plans into products, or other groups, such as if there are differences in geography, provider networks, or administrative structure.
The sample that a vendor selects to survey should be drawn from a list of individuals (adults aged 18 and older, or children 17 and younger) covered by the plan or product. This list, which typically would be provided by the sponsor, is the sample frame.
Defining the Sample Frame: Eligibility Guidelines
Below are the CAHPS guidelines for determining who to include in the sample frame for the commercial survey (Medicaid survey):
- If surveying adults, include all individuals 18 years or older who have been enrolled in a health plan or product for 12 (6) months or longer, with no more than one 30-day break in enrollment during the 12 (6) months.
- If surveying children, include all individuals 17 years or younger who have been enrolled in a health plan or product for 12 (6) months or longer, with no more than one 30-day break in enrollment during the 12 (6) months.
- To identify those who have been enrolled in the plan or product for 12 (6) months or longer, use the anticipated start date of data collection to determine whether the person meets the 12 (6)-month eligibility requirement. For example, if the anticipated start date is March 1, 2026, include all those who have been continuously enrolled since March 1, 2025 (September 1, 2025).
- Allow the sample frame to include multiple individuals from the same household, but the final sample must contain no more than one person (adult or child) per household. Where a second member of an already-sampled household is drawn, discard that draw and replace it with another random draw from the frame.
- Include individuals with primary health coverage through the plan. Do not include individuals with only other types of coverage, like a dental-only plan.
- In the case of individuals who switch (or children who are switched) from one product to another within the same plan during the continuous enrollment period, count them as enrolled in the product in which they were enrolled the longest. For example, in the last 6 months, if the individual who was enrolled in a health plan’s HMO product for 4 months switched to the same health plan’s POS product, consider that person continuously enrolled in the health plan’s HMO product.
- All CAHPS survey items have been designed for the general population. Appropriate screening items are included for items targeted to assess a specific experience. In order to ensure that results are comparable to those produced by other sponsors and vendors, targeted sampling, such as selecting only patients with particular conditions or experiences, is not recommended. Targeted sampling should only be used to supplement the general population sample, if desired (e.g., adding sample to target children with chronic conditions).
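The continuous-enrollment cutoff in the guidelines above is the anticipated start date of data collection moved back by the look back period. A minimal sketch using only the Python standard library follows. The function names and the `plan_type` labels are hypothetical, and clamping the day to the end of a shorter month (for example, August 31 minus 6 months becoming the last day of February) is an assumption of this sketch, not a rule stated in the CAHPS guidance.

```python
import calendar
from datetime import date

# Look back period in months, per the guidelines: 12 for commercial, 6 for Medicaid.
LOOKBACK_MONTHS = {"commercial": 12, "medicaid": 6}


def months_before(d: date, months: int) -> date:
    """Shift a date back by whole calendar months, clamping the day if needed."""
    index = d.year * 12 + (d.month - 1) - months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def enrollment_cutoff(start_of_fielding: date, plan_type: str) -> date:
    """Date on or before which a person must have been enrolled to be eligible."""
    return months_before(start_of_fielding, LOOKBACK_MONTHS[plan_type])
```

Individuals continuously enrolled since the returned date (allowing for one break of no more than 30 days) would then be kept in the sample frame.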
The following section explains how to calculate the appropriate sample size for the HP CAHPS Survey. The instructions are the same for both the Adult and Child versions as well as the Commercial and Medicaid versions.
Calculating the Sample Size for the Adult (Child) Questionnaire
It is recommended that the user select enough individuals to obtain approximately 300 completed adult (child) questionnaires per plan/product. For example, for an anticipated response rate of 50 percent, the user would need to start with a minimum sample size of 600.
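The starting sample size is the target number of completed questionnaires divided by the anticipated response rate, rounded up. A small sketch, with hypothetical function and parameter names; the response rate is passed as a whole-number percentage so the arithmetic stays in integers.

```python
def starting_sample_size(target_completes: int = 300, response_rate_pct: int = 50) -> int:
    """Minimum number of individuals to sample to reach the target completes."""
    if not 0 < response_rate_pct <= 100:
        raise ValueError("response rate must be between 1 and 100 percent")
    # Ceiling division in integer arithmetic: ceil(target * 100 / pct).
    return -(-target_completes * 100 // response_rate_pct)
```

With the defaults this returns 600, matching the example above; an anticipated response rate of 40 percent would raise the starting sample to 750. This does not account for undeliverable or ineligible cases, which would call for a larger sample.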
If users anticipate that poor contact information (addresses and telephone numbers) will decrease the number of questionnaires that reach the sampled individuals, a larger sample may be needed.
If one or more of the plans do not have a membership large enough to draw the required sample size, the sample will be everyone in the health plan enrollee population who meets all of the eligibility criteria. Even under these circumstances, the sample may include only one adult (child) per household.
Sampling information is provided to users as part of the “Fielding the CAHPS Health Plan Survey” document available in the “Guidance for using the CAHPS Health Plan Survey” zip file: https://www.ahrq.gov/sites/default/files/wysiwyg/cahps/surveys-guidance…
Data are not reported for any item or measure with fewer than 20 valid responses, and health plans with fewer than 20 responses are not included. AHRQ recommends approximately 300 completed questionnaires per plan/product so that there are enough responses for results to be statistically reliable.
Proxy Respondents
The HP CAHPS Survey does allow for proxy respondents in the mail and web-based modes. At the end of the survey, there is an item that asks “Did someone help you complete this survey?” If the answer is Yes, the follow-up question is “How did that person help you?” and respondents are to mark one or more of these response options:
1. Read the questions to me
2. Wrote down the answers I gave
3. Answered the questions for me
4. Translated the questions into my language
5. Helped in some other way
However, these last two questions of the core questionnaire are not included in telephone scripts because telephone interviews should not be conducted with proxy respondents.
Supplemental Attachment
Point of Contact
CAHPS® is a registered trademark of the U.S. Department of Health and Human Services and managed by AHRQ.
Karen Chaves
Rockville, MD
United States
Naomi Yount
Westat
Rockville, MD
United States
Importance
Evidence
The HP CAHPS Survey measures key components of patient experience, such as how well doctors communicate and getting needed care, that are consistent with patient-centered care. The CAHPS Surveys focus on aspects of care that consumers have identified as important and for which patients are the best or only source of information. Measuring patients’ perceptions of their healthcare experience is not just a means to improve services—it’s a recognition that the patient’s voice matters in and of itself. Listening to patients affirms their role as active participants in their care, and their insights are essential to truly understanding the quality and impact of healthcare delivery. In 2024, over 200,000 health plan enrollees throughout the country completed the HP CAHPS Survey for their health plan and their health plan submitted this data to the AHRQ CAHPS Database. Since submission to the AHRQ CAHPS Database is not mandatory, it is likely that far more health plans are administering and using these data. Public reporting of these survey results creates incentives for health plans and state agencies to improve their quality of care, directly impacting the patients who receive it. Because of this, it is important to ensure that the survey aligns with what patients believe constitutes high-quality care. We reviewed the literature on the determinants of patient care experiences measured by CAHPS and their associations with other indicators of health care quality. CAHPS is also an actionable measure that helps health plans target interventions that will improve the quality and patient-centeredness of care.
Review of the Evidence
Prior research has identified several features of healthcare delivery structure, including plan characteristics and market-level characteristics, that are associated with patient experiences. Three major systematic reviews have examined the relationships among patient experience, clinical processes, and patient outcomes. A systematic review performed by researchers in the U.K. found that patient experience is favorably associated with adherence to recommended medications and treatments, preventive care such as screenings and immunizations, patient-reported health outcomes, clinical outcomes, reduced hospitalizations and primary care visits, and reduced adverse events (Doyle et al., 2013). Anhang Price et al. (2014) reviewed evidence on the association between patient experiences and other measures of health care quality in the U.S. They similarly found that better patient care experiences are associated with higher levels of adherence to recommended prevention and treatment processes, better clinical outcomes, and less health care utilization. Navarro et al. (2021) reviewed 9 studies and found that ratings of patient experience are related to the overall rating of health care and can influence clinical and quality outcomes.
Structure
Health plan type and market characteristics have been found to predict patient experience in several studies. Among managed care organizations (MCOs), for example, Medicaid enrollees had significantly less favorable CAHPS scores than commercial plan enrollees (Elliott, Farley, et al., 2005). For-profit and nationally affiliated health plans tended to receive worse patient experience scores, particularly on overall ratings of the health plan and composite measures on health plan customer service and access (Landon et al., 2021). The financial strength of health plans, as measured by their fiscal margins, was associated with more favorable CAHPS scores (Beauvais et al., 2007). Market-level factors such as HMO competition and penetration did not appear to affect patient experience (Scanlon, Swaminathan, et al., 2008), but Medicare beneficiaries in “higher-intensity” healthcare markets reported more problems getting care quickly than in markets with less healthcare consumption (Mittler, Landon, et al., 2010). For health plans, prior work highlights multi-level impacts on patient experience, which can occur at the system, care site, or physician level – with physicians accounting for the largest proportion of explainable variance (Rodriguez, Scoggins, et al., 2009). Improving the infrastructure supporting certain aspects of care may have broad effects because system changes can influence multiple outcomes (Cleary, 2016).
Quality Improvement / Interventions
Researchers have used the CAHPS survey to learn if accountable care organization (ACO) incentives to limit health care use and improve quality may enhance or hurt patients’ experiences with care. More specifically, using CAHPS survey data covering 3 years before and 1 year after the start of Medicare ACO contracts in 2012 as well as linked Medicare claims, McWilliams et al. (2014) compared patients’ experiences in a group of 32,334 fee-for-service beneficiaries attributed to ACOs (ACO group) with those in a group of 251,593 beneficiaries attributed to other providers (control group), before and after the start of ACO contracts. They found that, in the first year, ACO contracts were associated with meaningful improvements in some measures of patients’ experience and with unchanged performance in others. Lastly, studies have found that hospitals with more positive perceptions of patient safety culture tend to have more positive CAHPS scores from their patients (Sorra et al., 2012; Abrahamson et al., 2016). This finding suggests that improvements in patient safety culture may lead to improved patient experience with care.
Outcomes
Out of 40 evidence papers with outcome measures, Doyle’s (2013) meta-analysis found 29 studies that reported positive associations between patient experience and clinical outcomes, 11 with no associations, and none with negative associations. The lack of more evidence may be due to associations between a patient’s illness level, their level of care, and their likelihood for a poor outcome such as mortality, morbidity, or a readmission. Often, such associations have more than one plausible direction of causality. For example, clinicians may be especially attentive to the needs of sicker patients (Kahn et al., 2007) and patients near the end of life (Elliott, Haviland, et al., 2013).
Moreover, substantial evidence points to a positive association between various components of patient experience, such as good communication between clinicians and patients, and several important processes and outcomes. These include lower utilization of unnecessary healthcare services; better patient adherence to medical advice; better process of care measures for acute myocardial infarction (AMI), congestive heart failure, pneumonia and surgery; lower inpatient mortality among acute myocardial infarction (AMI) patients; lower infection rates (Anhang Price et al., 2014); and better clinician and staff perceptions of patient safety culture (Sorra et al., 2012).
Schneider and colleagues (2001) found that two HP CAHPS Survey composites were associated with several Healthcare Effectiveness Data and Information Set (HEDIS) clinical process measures among Medicare health plan enrollees. They found that experiences obtaining needed care and getting information and customer service from health plans were associated with mammography, eye examinations for diabetics, receipt of beta-blockers following myocardial infarction, LDL cholesterol testing following an acute cardiovascular event, and follow-up within 30 days following a hospitalization for mental illness (Schneider et al., 2001).
Utilization
Research suggests an association between better patient experiences and lower healthcare utilization. Platonova and Carnes (2023) found that better results on the Provider Communication composite measure led to 19% fewer ER visits for Medicaid patients in North Carolina. The items within the composite measure had strong relationships with ER visits, with the provider treating the patient with respect associated with 37% fewer ER visits.
In another study, children with asthma were less likely to visit the emergency department, make urgent office visits, or be hospitalized if their physicians had reviewed a long-term therapeutic plan with their parents (Clark, Cabana, et al., 2008). Among African Americans with Type 2 diabetes, those who reported that doctors or nurses usually listened carefully or spent enough time with them were significantly less likely to visit the emergency department in the 12 months following completion of a patient experience survey (Gary, Maiese, et al., 2005). Fenton et al. found that patients who rated their providers most highly had lower odds of visiting the emergency department but higher odds of being admitted to the hospital the following year (Fenton, Jerant, et al., 2012). Children whose parents report longer waits for primary care visits were more likely to visit the emergency department for non-urgent reasons than those who report waiting for less time (Brousseau, Bergholte, et al., 2004).
Health-related Patient Behavior and Disease Management
One composite of the HP CAHPS survey assesses patients’ perceptions of how well providers communicate with them. Better patient-provider communication promotes healthcare-related patient behaviors (Fuertes, Boylan, et al., 2009). A 2009 meta-analysis of 127 studies assessing the link between patient treatment adherence and physician-patient communication found a 19% higher risk of non-adherence among patients whose physician communicated poorly (Zolnierek and Dimatteo, 2009). Doyle’s (2013) meta-analysis showed positive associations between the quality of clinician-patient communications and adherence to medical treatment in 125 of 127 studies analyzed. Studies using the CAHPS measure have found that better provider communication is positively associated with adherence to hypoglycemic medications among diabetics (Ratanawongsa, Karter, et al., 2013), adherence to tamoxifen among breast cancer patients (Liu, Malin, et al., 2013), and higher rates of colorectal cancer screening among adults in the US (Carcaise-Edinboro and Bradley, 2008).
References
Abrahamson K, Hass Z, Morgan K, Fulton B, Ramanujam R. (2016). The relationship between nurse-reported safety culture and the patient experience. J Nurs Adm. 46(12):662-668. doi: 10.1097/NNA.0000000000000423. PMID: 27851708.
Anhang Price, R, Elliott, MN, Zaslavsky, AM, Hays, RD, Lehrman, WG, Rybowski, L, Edgman-Levitan, S, Cleary, PD. (2014) Examining the role of patient experience surveys in measuring health care quality. Med Care Res Rev. 71(5), 522-54.
Beauvais, B, Wells, R., Vasey, J., and DelliFraine, J. (2007). Does money really matter? The effects of fiscal margin on quality of care in military treatment facilities. Hosp. Top. 85(3), 2-15.
Brousseau, D. C., Bergholte, J., et al. (2004). The effect of prior interactions with a primary care provider on nonurgent pediatric emergency department use. Archives of Pediatrics & Adolescent Medicine. 158(1), 78-82.
Carcaise-Edinboro, P. and Bradley, CJ. (2008). Influence of patient-provider communication on colorectal cancer screening. Medical Care. 46(7), 738-745.
Clark, NM., Cabana, MD. et al. (2008). The clinician-patient partnership paradigm: Outcomes associated with physician communication behavior. Clinical Pediatrics. 47(1), 49-57.
Cleary, PD. (2016) Evolving concepts of patient-centered care and the assessment of patient care experiences; optimism and opposition. J Health Pol, Policy & Law. 41(4), 675-696.
Doyle, C., L. Lennox, et al. (2013). A systematic review of evidence on the links between patient experience and clinical safety and effectiveness. BMJ Open. 3(1). http://bmjopen.bmj.com/content/3/1/e001570.full
Elliott, MN., Farley, D., Hambarsoomians, K., and Hays, R.D. (2005). Do Medicaid and commercial CAHPS scores correlate within plans?: A New Jersey case study. Med Care. 43(10), 1027-1033.
Elliott, MN., Haviland, AM., et al. (2013). Care experiences of managed care Medicare enrollees near the end of life. Journal of the American Geriatrics Society 61(3), 407-412.
Fenton, JJ., Jerant, AF., et al. (2012). The cost of satisfaction: a national study of patient satisfaction, health care utilization, expenditures, and mortality. Archives of Internal Medicine. 172(5), 405-411.
Fuertes, JN., Boylan, LS., et al. (2009). Behavioral indices in medical care outcome: The working alliance, adherence, and related factors. Journal of General Internal Medicine. 24(1), 80-85.
Gary, TL., Maiese, EM., et al. (2005). Patient satisfaction, preventive services, and emergency room use among African-Americans with type 2 diabetes. Disease Management. 8(6), 361-371.
Kahn, KL., Tisnado, DM., et al. (2007). Does ambulatory process of care predict health-related quality of life outcomes? Health Services Research. 42, 63-83.
Landon, BE., Zaslavsky, AM., Beaulieu, ND., Shaul, JA., Cleary, PD. (2021). Health plan characteristics and consumers’ assessments of quality. Health Affairs, 20(2). 274-286.
Liu, Y., Malin, JL., et al. (2013). Adherence to adjuvant hormone therapy in low-income women with breast cancer: The role of provider-patient communication. Breast Cancer Research and Treatment. 137(3), 829-836.
McWilliams, JM, Landon, BE, Chernew, ME, Zaslavsky, AM. (2014). Changes in patients' experiences in Medicare Accountable Care Organizations. N Engl J Med. 371(18), 1715-24.
Mittler, J., Landon, B., Fisher, E., Cleary, P., and Zaslavsky, A. (2010). Market variations in intensity of Medicare service use and beneficiary experiences with care. Health Serv Res 45(3), 647-669.
Navarro, S., Ochoa, CY., Chan, E., Du, S., Farias, AJ. (2021). Will improvements in patient experience with care impact clinical and quality of care outcomes?: A systematic review. Medical Care. 59(9), 843-856, DOI: 10.1097/MLR.0000000000001598
Platonova, EA, Carnes, KJ. (2023). Relationship between patient-centered primary care provider communication and emergency room visits in the Medicaid population in North Carolina, United States. J Prim Care Community Health. 14, doi: 10.1177/21501319231171430
Ratanawongsa, N., Karter, AJ., et al. (2013). Communication and medication refill adherence: the Diabetes Study of Northern California. JAMA Internal Medicine. 173(3), 210-218.
Rodriguez, HP, Scoggins, JF, von Glahn, T, Zaslavsky, AM, Safran, DG. (2009) Attributing sources of variation in patients' experiences of ambulatory care. Med Care. 47(8), 835-41.
Scanlon, D., Swaminathan, S., Lee, W., and Chernew, M. (2008). Does competition improve health care quality? Health Serv Res. 43(6), 1931-1951.
Schneider, EC, Zaslavsky, AM, et al. (2001). National quality monitoring of Medicare health plans: the relationship between enrollees' reports and the quality of clinical care. Medical Care. 39(12), 1313-1325.
Sorra, J, Khanna, K, Dyer, N, Mardon, R, Famolaro, T. (2012) Exploring relationships between patient safety culture and patients’ assessments of hospital care. Journal of Patient Safety 8(3), 131–139.
Zolnierek, KB. and Dimatteo, MR. (2009). Physician communication and patient adherence to treatment: a meta-analysis. Medical care. 47(8), 826-834.
It is important that the Child HP CAHPS Survey reflects the aspects of care that parents/guardians associate with high-quality healthcare for their child. Through a literature review, focus groups, a Technical Expert Panel, and numerous other development activities, the CAHPS Consortium found that an overall rating of the specialist is an important component of care. This measure asks the respondent to rate the specialist on a scale from worst (0) to best (10). The overall rating measure captures a comprehensive view of the patient’s experience with the specialist. It integrates multiple aspects of the experience with the specialist into a single rating. The simple summary rating score can help make comparisons easier for enrollees.
Measure Impact
The family of CAHPS surveys measures aspects of patient-centered care that complement clinical process and outcome measures in consumer choice, quality improvement, public reporting, and pay-for-performance programs (Anhang Price et al., 2014). Published research indicates that individuals use information from patient experience measures to make decisions about their healthcare providers and plans. One study found that seeing publicly reported quality information was a determinant of choosing higher quality-rated health plans, although the weight given to quality information also depended on other features, such as cost and provider choice (Faber et al., 2009). A study of low-income parents in New York State found that parents chose separate CHIP managed care plans with higher CAHPS scores for their newly enrolled children (Liu et al., 2009). Additionally, a study of physician choice found that patients choosing a new primary care physician valued other patients’ reports of interpersonal quality and overall recommendations (Fanjiang et al., 2007).
Patient experiences with health plans are also linked to their persistence in the plans. For example, one study found that the mean voluntary disenrollment rate among Medicare managed care enrollees is four times higher for plans in the lowest 10 percent of overall CAHPS Health Plan survey ratings than for those in the highest 10 percent (Lied et al., 2003). At the provider level, patients who reported the poorest-quality relationships with their physicians are three times more likely to voluntarily leave the physicians’ practice than patients with the highest-quality relationships (Safran et al., 2001).
Racial and ethnic patient subgroups may value various aspects of the care experience differently. CAHPS surveys have been used to measure these differences. For example, Collins et al. (2017) found that the CAHPS domains with the most importance to respondents varied across subgroups. These researchers conclude that tailoring quality improvement programs to the factors most important to the racial, ethnic, and language mix of the patient population of the health plan may help improve quality.
References
Anhang Price, R, Elliott, MN, Zaslavsky, AM, Hays, RD, Lehrman, WG, Rybowski, L, Edgman-Levitan, S, Cleary, PD. (2014) Examining the role of patient experience surveys in measuring health care quality. Med Care Res Rev. 71 (5): 522-54.
Faber, M, Bosch, M., Wollersheim, H, Leatherman, S, and Grol, R. (2009). Public reporting in health care: how do consumers use quality-of-care information? A systematic review. Med Care. 47(1): 1-8.
Collins, RL, Haas, A, Haviland, AM, Elliott, MN. (2017). What Matters Most to Whom: Racial, Ethnic, and Language Differences in the Health Care Experiences Most Important to Patients. Med Care. 55(11): 940-947.
Fanjiang, G, von Glahn, T, Chang, H, Rogers, W, and Safran, D. (2007). Providing patients web-based data to inform physician choice: if you build it, will they come? J Gen Intern Med. 22(10): 1463-1466.
Lied, TR, Sheingold, SH, Landon, BE, Shaul, JA, Cleary, PD. (2003). Beneficiary reported experience and reported voluntary disenrollment in Medicare managed care. Health Care Finance Rev. 25(1): 55–66.
Liu, H, Phelps, C, Veazie, P, Dick, A, Klein, J, Shone, L, Noyes, K, and Szilagyi, P. (2009). Managed care quality of care and plan choice in New York SCHIP. Health Serv Res. 44(3): 843-861.
Safran, DG, Montgomery, JE, Chang, H, Murphy, J, Rogers, WH. (2001). Switching doctors: predictors of voluntary disenrollment from a primary physician’s practice. J Fam Practice. 50(2): 130–6.
The Child HP CAHPS questions focus on aspects of care for which the parent/guardian of the child is the best or only source of information and that reflect elements of care that are most meaningful. Published research indicates that individuals use information from patient experience measures to make decisions about their healthcare providers and plans. One study found that seeing publicly reported quality information was a determinant of choosing higher quality-rated health plans, although the weight given to quality information also depended on other features, such as cost and provider choice (Faber et al., 2009). A study of low-income parents in New York State found that parents chose SCHIP managed care plans with higher CAHPS scores for their newly enrolled children (Liu et al., 2009). Patient experiences with health plans are also linked to their persistence in the plans. For example, one study found that the mean voluntary disenrollment rate among Medicare managed care enrollees is four times higher for plans in the lowest 10 percent of overall CAHPS Health Plan survey ratings than for those in the highest 10 percent (Lied et al., 2003). These results may generalize to the pediatric population, since parents/guardians generally decide which health plan to enroll the child in. At the provider level, patients who reported the poorest-quality relationships with their physicians are three times more likely to voluntarily leave the physicians’ practice than patients with the highest-quality relationships (Safran et al., 2001).
The Rating of Specialist measure is important to enrollees because it reflects their overall experience with specialists, who often play a critical role in diagnosing and managing complex or chronic health conditions. Seeing a specialist is often prompted by a significant health concern, making the quality of that interaction especially impactful. This measure captures the overall experience with the specialist and helps enrollees decide from whom to seek care. It also offers enrollees a meaningful way to express concerns or positive experiences with their child’s specialists.
References
Faber, M, Bosch, M., Wollersheim, H, Leatherman, S, and Grol, R. (2009). Public reporting in health care: how do consumers use quality-of-care information? A systematic review. Med Care. 47(1), 1-8.
Lied, TR, Sheingold, SH, Landon, BE, Shaul, JA, Cleary, PD. (2003). Beneficiary reported experience and reported voluntary disenrollment in Medicare managed care. Health Care Finance Rev. 25(1), 55–66.
Liu, H, Phelps, C, Veazie, P, Dick, A, Klein, J, Shone, L, Noyes, K, and Szilagyi, P. (2009). Managed care quality of care and plan choice in New York SCHIP. Health Serv Res. 44(3), 843-861.
Safran, DG, Montgomery, JE, Chang, H, Murphy, J, Rogers, WH. (2001). Switching doctors: Predictors of voluntary disenrollment from a primary physician’s practice. J Fam Practice. 50(2), 130–6.
Performance Gap
The analyses were based on data from the 2023 and 2024 Child HP CAHPS Survey Database.
The 2024 survey was administered from July 2023 to June 2024 and includes 234 plans and 111,833 respondents.
To examine the performance gap over time, we also analyzed the 2023 survey data. The 2023 survey was administered from July 2022 to June 2023 and includes 233 plans and 103,515 respondents.
Deciles in the Performance Scores by Decile tables are based on the performance scores, and the “mean performance score” row in the tables shows the average unadjusted top box score across health plans. When a performance score was tied across a decile boundary, all health plans with that score were assigned to the same decile.
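The scoring and decile steps described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the analysis code behind the tables: the function names are hypothetical, a "top box" response is assumed to be a rating of 9 or 10, and tied plans are assumed to fall into the lower of the two deciles they straddle (the submission does not say which decile receives them).

```python
def top_box_score(ratings, top_values=(9, 10)):
    """Percent of non-missing 0-10 ratings that fall in the top box."""
    answered = [r for r in ratings if r is not None]
    if not answered:
        return None  # no usable responses for this plan
    return 100.0 * sum(r in top_values for r in answered) / len(answered)


def assign_deciles(plan_scores):
    """Map plan id -> decile (1-10); plans with equal scores share a decile.

    Assumption: ties are judged on the stored (e.g., rounded) score, and a
    tied group takes the decile of its lowest-ranked member.
    """
    ordered = sorted(plan_scores.items(), key=lambda kv: kv[1])
    n = len(ordered)
    first_rank = {}
    for idx, (_, score) in enumerate(ordered):
        first_rank.setdefault(score, idx)  # rank of first plan with this score
    return {plan: first_rank[score] * 10 // n + 1 for plan, score in ordered}


# Hypothetical plan-level scores; plans B and C are tied.
scores = {"A": 50.0, "B": 60.0, "C": 60.0, "D": 70.0, "E": 71.0,
          "F": 72.0, "G": 73.0, "H": 74.0, "I": 75.0, "J": 76.0}
deciles = assign_deciles(scores)  # B and C both land in decile 2
```

Because tied plans are kept together, deciles can contain unequal numbers of plans, which is consistent with the uneven entity counts shown in the decile table.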
As shown in Table 1 and the attached Table 2.4a.6 for the Rating of Specialist measure:
- the 2024 mean top box score = 72.3; number of measured entities = 230; and number of respondents was 27,054.
- the 2023 mean top box score = 71.0; number of measured entities = 226; and number of respondents was 23,622.
The mean top box score increased slightly (71.0% to 72.3%), while the maximum score remained the same (89% in both 2024 and 2023). The scores are lower than ideal (100% is ideal), since only about three-quarters of respondents rated their child’s specialist as a 9 or 10 on a scale of 0 to 10. Although scores increased from 2023 to 2024, which suggests that the performance gap may be decreasing over time, health plans need to continue working to improve their specialist ratings.
| | Overall | Minimum | Decile_1 | Decile_2 | Decile_3 | Decile_4 | Decile_5 | Decile_6 | Decile_7 | Decile_8 | Decile_9 | Decile_10 | Maximum |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Mean Performance Score | 72.3 | 57.0 | 62.1 | 66.3 | 68.4 | 70.6 | 72.6 | 74.0 | 75.4 | 77.5 | 79.5 | 82.6 | 89.0 |
| N of Entities | 230 | 2 | 26 | 23 | 23 | 25 | 27 | 15 | 27 | 26 | 20 | 15 | 1 |
| N of Persons / Encounters / Episodes | 27,054 | 100 | 2,520 | 2,739 | 3,332 | 2,557 | 4,116 | 1,838 | 2,887 | 3,185 | 1,933 | 1,812 | 35 |
Care Gaps
Closing Care Gaps
Optional for Fall 2025.
Feasibility
The HP CAHPS Survey is a standardized instrument designed to assess patient experience of care. As these patient-experience data are collected from patients, the structured data are not available in electronic sources outside of the data collection by the health plan or sponsoring organization.
The data are collected through a survey instrument that is administered directly to health plan enrollees, not during care delivery. Surveys are generally mailed to the sampled enrollees, and those survey results can be entered into structured databases (e.g., Excel, SPSS, SAS). No proprietary platform is required to administer the survey. Though mixed-mode administration (i.e., mail and phone) is a viable strategy for the collection of CAHPS surveys, mail continues to be the most frequent mode for most CAHPS surveys. Users then create electronic databases of results after receipt of the completed hard copy survey through scanning or data entry. However, vendors may set up their database before data collection by populating the frame to assist in identifying nonresponse.
Traditionally, the rationale for not using electronic sources more broadly is that mail and telephone are the best ways to obtain representative samples of patients based on the contact information that is available for sampling and data collection. Web/internet has been added as a mixed mode strategy for health plans for their enrollees.
Structured or unstructured fields. All items are structured and use either a 4-point Likert-type response scale (1-4) or, for the rating items, a 0-10 scale. All responses are numeric.
Electronic feasibility. HP CAHPS Survey users can offer respondents a web survey, though it should not be the only option because it may exclude enrollees who have limited or no access to the web and/or who do not have an email address at which to receive an electronic version of the survey.
Missing data. Item-level missing data is low on the HP CAHPS Survey; however, some items will have fewer responses than others due to gate or filter questions. For example, if a respondent has not seen a personal doctor during the reference period (e.g., 6 months), they skip the items about their experiences with doctors. As a result, some CAHPS Health Plan Survey items have higher percentages of missing data overall, but when skip patterns are considered, the percentages of inappropriate missing data are much lower (< 10%).
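The distinction between overall and inappropriate missing data can be illustrated with a short Python sketch. The item names (`has_personal_doctor`, `doctor_explained`) and the gate coding (1 = yes) are hypothetical placeholders rather than actual survey variable names, and a missing gate answer is assumed to make the respondent ineligible for the follow-up item.

```python
def missing_rates(responses, item, gate_item, gate_pass_value):
    """Return (overall %, inappropriate %) missing for one gated item.

    Overall: missing among all respondents.
    Inappropriate: missing among respondents whose gate answer says they
    should have answered the item (skip pattern taken into account).
    """
    total = len(responses)
    eligible = [r for r in responses if r.get(gate_item) == gate_pass_value]
    overall_missing = sum(r.get(item) is None for r in responses)
    inappropriate_missing = sum(r.get(item) is None for r in eligible)
    overall_pct = 100.0 * overall_missing / total if total else 0.0
    inappropriate_pct = (100.0 * inappropriate_missing / len(eligible)
                         if eligible else 0.0)
    return overall_pct, inappropriate_pct


# Hypothetical records: two respondents have no personal doctor (gate = 2)
# and are correctly skipped; one eligible respondent left the item blank.
sample = [
    {"has_personal_doctor": 1, "doctor_explained": 4},
    {"has_personal_doctor": 1, "doctor_explained": None},
    {"has_personal_doctor": 2, "doctor_explained": None},
    {"has_personal_doctor": 2, "doctor_explained": None},
    {"has_personal_doctor": 1, "doctor_explained": 3},
]
overall, inappropriate = missing_rates(
    sample, "doctor_explained", "has_personal_doctor", 1)
```

In this toy sample the overall missing rate is 60% but only one of three eligible respondents is inappropriately missing, which mirrors why gated items look worse before skip patterns are considered.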
Measure susceptibility to inaccuracies and ability to audit data: The HP CAHPS Survey captures self-reported perceptions of or experiences with the care received, so responses cannot be independently verified for accuracy. The procedures for administering the HP CAHPS Survey have been standardized by the Centers for Medicare & Medicaid Services (CMS) and NCQA for many years. Because NCQA-accredited health plans are required to submit HP CAHPS Survey results to the NCQA, those plans most often contract with an NCQA-certified survey vendor to accurately collect and report CAHPS survey results. NCQA requires strict adherence to its standardized procedures and protocols for survey administration and collection. NCQA staff monitors each survey vendor’s work and provides ongoing technical support to survey vendors. CMS contracts with vendors who are required to adhere to strict standards for survey administration and analysis. Both the NCQA and CMS place a high value on aligning requirements to assist in streamlining CAHPS measurement for Health Plans and for those plan enrollees who are being surveyed. Further, data submitted to the AHRQ Database are reviewed to ensure there are no out of range values and that skip patterns are followed.
Change to the Instrument. Since the instrument was last endorsed, the only change to the survey was the addition of text that allows respondents to include care delivered virtually (i.e., by phone or by video), to account for changes in care delivery due to the COVID-19 pandemic. For example, instructions now include “by phone, or by video”: “These questions ask about your own health care from a clinic, emergency room, or doctor’s office. This includes care you got in person, by phone, or by video. Do not include care you got when you stayed overnight in a hospital. Do not include the times you went for dental care visits.” This change did not impact data structure or availability.
Data Collection Burden
For respondents: The survey takes approximately 15 minutes to complete, dependent on the individual.
For states and health plans: Survey sampling uses administrative enrollment data that is maintained by all health plans and easily accessible to produce a sampling frame. Health plans generally hire a survey vendor to administer, track, and analyze their survey data resulting in lower burden for the health plans. To help reduce burden, states can use data submitted to the AHRQ CAHPS Databases to satisfy requirements for CMS Core Set Reporting (https://www.medicaid.gov/sites/default/files/2024-05/cahpsfactsheet_1.p…).
Cost Considerations
The HP CAHPS Survey is freely available for use with no proprietary fees.
The cost to hire a vendor varies based on the size of the health plan and desired number of completed surveys. AHRQ provides guidance for hiring a vendor and resources for finding a certified vendor (https://www.ahrq.gov/cahps/surveys-guidance/helpful-resources/hiring/in…).
Impact on Clinician Workflow
The HP CAHPS Survey does not interfere with diagnostic thought processes or patient-physician interactions because the survey is administered retrospectively, after care has been given, not during the visit.
Potential Barriers and Mitigation Strategies
Achieving a desired response rate may be difficult for users. Phone is not optimal as the only mode of survey administration, but it is commonly used as a follow-up for CAHPS mail surveys. Phone follow-up can improve CAHPS response rates compared to mail-only (Burkhart et al., 2014; Fowler et al., 2002; Gallagher et al., 2005; Klein et al., 2011). A study of Medicare beneficiaries found that response rates continue to improve when up to 4 follow-up calls are made (Burkhart et al., 2014). In addition, phone follow-up calls help to achieve better representation of patients in terms of income, literacy/education, health status, age, gender, and race/ethnicity, above and beyond mail surveys alone (Tesler and Sorra, 2017). The CAHPS Consortium continues to conduct research to develop and test survey administration methods that can improve the efficiency of data collection, enhance response rates, and gather more information about the experiences of those segments of the patient population that are hard to reach through more traditional means. This research includes: 1) studies comparing the effect of administration modes on response rates, survey scores, and data collection costs (e.g., mode comparisons have included in-office distribution vs. mail; email vs. mail); 2) studies assessing the effect of survey length on response rates and survey scores; 3) studies examining the impact of incentives on response rates; and 4) studies comparing the effect of different survey formats and design on survey responses. AHRQ also provided a webinar on how to achieve higher response rates (https://www.ahrq.gov/cahps/news-and-events/events/webinar-011124.html).
Analysis and Reporting: AHRQ makes available many resources to assist with analysis and reporting. For instance, there is a free CAHPS Analysis Program which is written for SAS that enables survey users to conduct the analyses needed to produce valid comparisons of performance across similar health care organizations. Users can also review documentation on how to prepare data for analysis (https://www.ahrq.gov/sites/default/files/wysiwyg/cahps/surveys-guidance…). Further, vendors usually conduct all analyses and reports.
References
Burkhart, Q, Haviland, A, Kallaur, P, et al. (2014). How much do additional mailings and telephone calls contribute to response rates in a survey of Medicare beneficiaries. Field Methods. 27(4): 409-25.
Fowler, FJ, Gallagher, PM, Stringfellow, VL, et al. (2002). Using telephone interviews to reduce nonresponse bias to mail surveys of health plan members. Med Care. 40(3): 190-200.
Gallagher, PM, Fowler, FJ, Stringfellow, VL. (2005). The nature of nonresponse in a Medicaid survey: causes and consequences. J Off Stat. 21(1): 73-87.
Klein, DJ, Elliott, MN, Haviland, AM, et al. (2011). Understanding nonresponse to the 2007 Medicare CAHPS survey. Gerontologist. 51(6): 843-55.
Tesler, R. and Sorra, J. CAHPS Survey Administration: What We Know and Potential Research Questions. (Prepared by Westat, Rockville, MD, under Contract No. HHSA 290201300003C). Rockville, MD: Agency for Healthcare Research and Quality: October 2017. AHRQ Publication No. 18-0002-EF. Accessible at https://www.ahrq.gov/sites/default/files/wysiwyg/cahps/about-cahps/rese….
Most vendors have established methods for tracking the sample. The Consortium suggests setting up a system to track the returned surveys by the unique ID number that is assigned to each respondent in the sample. This ID number should be placed on every questionnaire that is mailed and/or on the call record of each telephone case.
To maintain respondent confidentiality, the tracking system should not contain any of the survey responses. The survey responses should be entered in a separate data file linked to the sample file by the unique ID number. (This system will generate the weekly progress reports that should be reviewed closely.) Data should be stored securely—preferably on encrypted or password-protected systems—with access limited. If paper responses are used, they should be shredded following de-identified data entry.
The HP CAHPS Survey data is therefore de-identified upon data collection with a focus on protecting the confidentiality of respondents. Vendors are trained on maintaining confidentiality and any data submitted to the AHRQ CAHPS Database is de-identified. Only the plan name is known, and that is not reported out by AHRQ. Every plan and respondent is assigned a de-identified ID. Results are only reported in aggregate form. AHRQ does not report any results if there are fewer than 10 respondents and vendors have similar or more stringent rules.
The HP CAHPS Survey has a long history of use dating back to 1997. The HP CAHPS Survey has gone through four main revisions since that time, using field and psychometric testing conducted by multiple partners, including NCQA, CMS, and other stakeholders to increase the scientific rigor and relevance of the survey and the usability of the data. All survey development has been conducted by the CAHPS Consortium, a public-private research collaborative.
Steps which have contributed to the content and design of the HP CAHPS Survey over time have included:
- Literature review and review of existing measures
- Development and consultation with technical expert panels
- Focus groups with consumers
- Cognitive testing of survey questions to ensure they will be understood by respondents
- Field testing to assess the reliability of the survey results
- Cognitive testing of measure labels to ensure that survey results are communicated clearly to providers and the public
- Public comment
- On-going collaboration and harmonization with key partners and stakeholders
- Input from the NCQA Task Force and review and approval by the NCQA Committee on Performance Measurement to ensure harmonization with NCQA Health Plan accreditation requirements
The Consortium continues to conduct research to develop and test survey administration methods that can improve the efficiency of data collection, enhance response rates, and gather more information about the experiences of those segments of the patient population that have been hard to reach through more traditional means. This research includes: 1) studies comparing the effect of administration modes on response rates, survey scores, and data collection costs (e.g., mode comparisons have included in-office distribution vs. mail; email vs. mail); 2) studies assessing the effect of survey length on response rates and survey scores; 3) studies examining the impact of incentives on response rates; and 4) studies comparing the effect of different survey formats and design on survey responses.
To address data collection efficiency and to improve response rates, the CAHPS Consortium endorsed e-mail notification for web-based surveys as an additional mode of data collection. The CAHPS Consortium recommends a mixed mode that would have two e-mail reminders and a follow-up by mail or telephone to all who are in the survey sample. The follow-up to the entire sample is necessary to get a representative set of responses from a practice’s population, as not all patients may have e-mail.
Proprietary Information
Scientific Acceptability
Testing Data
The data used for these analyses are from the 2024 AHRQ CAHPS Health Plan Survey Database which includes data from Adult and Child Medicaid populations.
AHRQ launched the development of the CAHPS Health Plan Survey in 1995 and released the first version for public use in 1997. The development included four field tests (Crofton et al., 1999). Over the years, the CAHPS team has conducted multiple field tests of the Health Plan Survey in geographically diverse sites, analyzed the field test data, and revised the instrument as needed based on the findings.
For evidence of performance gap demonstrating persistent gaps over time, we also include top box statistics on the 2023 HP CAHPS Survey data administered from July 2022 to June 2023. Data were included in the analysis if they had at least one reportable item from the HP CAHPS Survey.
Unless noted otherwise, the top box scores presented are unadjusted since the results are not being used to compare entities, but rather for descriptive and scientific acceptability purposes.
Reference
Crofton, C., Lubalin, JS., Darby, C. (1999). Foreword. Medical Care 37(3), MS1-MS9.
The surveys were collected between July 2023 and June 2024.
None
Health plan level survey results are calculated across the respondents within a health plan. All health plans submitted Adult Medicaid Version 5.1 (233 plans) and Child Medicaid Version 5.1 (234 plans). Adult and child plans in this analysis each come from 49 states and the District of Columbia and Puerto Rico, as shown in Table 5.1.3a available in the Supplemental 7.1 zip file.
A total of 69,505 respondents to the Adult survey and 111,833 respondents to the Child survey (completed by the child’s parent, relative, or legal guardian) are included in the analysis. The Adult survey had an average of 298 respondents per plan, ranging from 20 to 6,462 respondents per plan. The Child survey had an average of 478 respondents per plan, ranging from 23 to 5,234 respondents per plan.
Tables 5.1.4a through g, available in the Supplemental 7.1 zip file, show descriptive characteristics of the respondents by the Adult and Child versions (sex, race/ethnicity, age, self-reported health status, education, survey mode, and survey language). Respondents were predominantly white and non-Hispanic (41%) and older than 54 (43%) for the Adult Survey and respondents were predominantly Hispanic or Latino (35%) for the Child Survey. 36% of respondents in the Adult Survey and 29% in the Child Survey had at least a GED or were a high school graduate. Most respondents completed the survey in English (85% for the Adult Survey and 72% for the Child Survey), and 12% completed it in Spanish for the Adult Survey, while 22% completed it in Spanish for the Child Survey.
Reliability
We estimated internal consistency reliability using the Cronbach’s coefficient alpha for each composite measure. A reliability of at least 0.70 is considered acceptable for group-level comparisons (Nunnally and Bernstein, 1994). For composite measures with more than two items, we show the impact on Cronbach’s alpha of deleting one of the items from the composite measure. However, CAHPS scores are designed to evaluate care across units of care such as plans, or physician groups, not individual patients.
The percentage of missing data was less than 10% for every item. We ran the Cronbach’s alpha excluding all missing data as well as with listwise deletion, and the results were the same. Given the similarity of results, we have presented the Cronbach’s alpha values with the inclusion of cases with missing values (listwise deletion) in section 5.2.3.
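For readers unfamiliar with the statistic, a minimal Python sketch of Cronbach’s alpha and the "alpha if item deleted" diagnostic follows. It is an illustration under stated assumptions, not the code used for the reported tables: function names are hypothetical, each row holds one respondent’s answers to the items of a single composite, and rows with any missing item are dropped (listwise deletion).

```python
from statistics import variance


def listwise_complete(rows):
    """Keep only respondents who answered every item in the composite."""
    return [r for r in rows if all(v is not None for v in r)]


def cronbach_alpha(rows):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total)."""
    rows = listwise_complete(rows)
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_if_item_deleted(rows):
    """Alpha recomputed with each item left out in turn (needs 3+ items)."""
    rows = listwise_complete(rows)
    k = len(rows[0])
    if k < 3:
        raise ValueError("deleting an item requires at least three items")
    return [cronbach_alpha([r[:i] + r[i + 1:] for r in rows])
            for i in range(k)]


# Hypothetical responses to a three-item composite on the 1-4 scale;
# the third respondent skipped one item and is dropped listwise.
data = [(1, 1, 1), (2, 2, 2), (3, None, 3), (3, 3, 3), (4, 4, 4)]
alpha = cronbach_alpha(data)
```

The three-item minimum in `alpha_if_item_deleted` reflects the point made above: deleting an item from a two-item composite leaves a single item, for which alpha is undefined.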
Reference
Nunnally JC, Bernstein IH. (1994). Psychometric Theory. New York: McGraw Hill.
Tables 5.2.3a1 and 5.2.3a2 (attached in 5.2.3a) show the Cronbach’s alpha for each composite measure in the Adult and Child surveys, respectively. For items within a composite measure consisting of 3 or more items, the Cronbach’s alpha if the item were deleted is provided to determine whether coefficient alpha could be improved by dropping an item. The tables also show the standardized item-to-total correlations.
For the Adult survey, two of the four composite measures have alphas higher than 0.70. Two are below criterion at 0.68 (Health Plan Customer Service) and 0.66 (Getting Needed Care). As shown in Table 5.2.3a, removal of any questions in a composite measure would not result in a higher Cronbach's alpha. Further, all the item to total correlations were above 0.40.
For the Child survey, one composite measure had an alpha higher than 0.70 (How Well Doctors Communicate). The other alphas ranged from 0.61 for Getting Needed Care to 0.68 for Health Plan Customer Service. As shown in Table 5.2.3a, removal of any questions in a composite measure would not result in a higher Cronbach's alpha since these measures are two-item measures. Further, the item-to-total correlations were above 0.40 for all items.
Cronbach’s alpha can be sensitive to the number of items in a scale, with more items often leading to higher reliability (Nunnally, 1978). All measures with reliability below 0.70 were two-item measures. While Cronbach’s alpha fell below the conventional threshold for several composite measures, it is not the most critical metric in this context. More important is the reliability at the unit level (e.g., plan-level reliability), which better reflects the measure’s utility for quality improvement. Nonetheless, internal consistency remains a relevant consideration in health care, and the Consortium will keep this in mind when implementing future revisions of the instrument.
Reference
Nunnally, J.C. (1978), Psychometric Theory, 2nd ed. New York: McGraw–Hill
We assess reliability at the health plan (site) level, which is the most relevant level of analysis for publicly reported CAHPS measures (Hays & Arnold, 1986, pp. 144-145). Since CAHPS surveys are used to compare groups/units, site-level reliability (which is directly related to the standard error of measurement) is used to determine the number of responses needed to obtain reliable information (Hays, Shaul, et al., 1999). Site reliability, which partitions within- and between-site variance, was calculated from the ICC and the Spearman-Brown prophecy formula in SAS version 9.4. Higher levels of site reliability correspond to more accurate performance measurement and a better ability to distinguish performance among practices. Reliability is not calculated when only one site is included in the decile or max/min value, since there would be no between-site variation. For cases where the number of respondents was tied across decile boundaries, all health plans with that number of respondents were assigned to the same decile.
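As a rough illustration of how an ICC and the Spearman-Brown prophecy formula combine into a plan-level reliability, consider the Python sketch below. It is not the SAS code used for this submission. It assumes a one-way random-effects ICC estimated from ANOVA mean squares and uses the ANOVA-adjusted average number of respondents per plan (n0) as the Spearman-Brown multiplier; the submission does not state which ICC estimator or which n was used, and the function names are hypothetical.

```python
def icc_oneway(groups):
    """One-way ANOVA ICC for respondent scores nested in plans.

    groups: list of lists, one inner list of scores per plan.
    Returns (icc, n0), where n0 is the adjusted average plan size.
    """
    g = len(groups)
    if g < 2:
        # With a single plan there is no between-plan variance to estimate.
        raise ValueError("reliability needs at least two plans")
    n_total = sum(len(x) for x in groups)
    grand_mean = sum(sum(x) for x in groups) / n_total
    means = [sum(x) / len(x) for x in groups]
    ss_between = sum(len(x) * (m - grand_mean) ** 2
                     for x, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for x, m in zip(groups, means) for v in x)
    ms_between = ss_between / (g - 1)
    ms_within = ss_within / (n_total - g)
    n0 = (n_total - sum(len(x) ** 2 for x in groups) / n_total) / (g - 1)
    icc = (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)
    return icc, n0


def spearman_brown(icc, n):
    """Reliability of a plan mean based on n respondents."""
    return n * icc / (1 + (n - 1) * icc)


# Hypothetical top-box indicators (1 = rated 9 or 10) for three small plans.
plans = [[1, 1, 0, 1], [0, 0, 1, 0], [1, 1, 1, 1]]
icc, n0 = icc_oneway(plans)
plan_reliability = spearman_brown(icc, n0)
```

With n0 as the multiplier, the result reduces algebraically to (MS_between - MS_within) / MS_between, which makes explicit how site reliability partitions between-plan and within-plan variance. The same `spearman_brown` function can be evaluated at other values of n to gauge how many responses per plan would be needed to reach a target reliability.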
As with internal consistency reliability (i.e., Cronbach’s alpha), values of 0.70 and higher are considered acceptable for site-level reliability (Nunnally and Bernstein, 1994) and group comparisons. For example, CMS does not report any score with reliability below 0.60 (such scores are labeled “Not available”), as that is considered low reliability. CMS reports scores that meet the sample size threshold and for which reliability falls between 0.60 and 0.70, but it flags these scores as having low reliability and alerts consumers to interpret them with caution. Scores with reliability of 0.70 or greater are reported without comment. Reliabilities of 0.85 or higher, where possible, are appropriate for applications such as pay-for-performance or actions that reward or classify individual practices.
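The reporting rule described above can be written as a small decision function. This is a sketch of the rule as stated in the text, not CMS code: the function name and status labels are hypothetical, the example values are made up, and the separate sample size threshold is omitted because it is not specified here.

```python
def reporting_status(reliability: float) -> str:
    """Map a site-level reliability value to the reporting treatment described in the text."""
    if reliability < 0.60:
        return "not reported"
    if reliability < 0.70:
        return "reported with low-reliability flag"
    return "reported without comment"


# Hypothetical reliability values for illustration.
statuses = {r: reporting_status(r) for r in (0.55, 0.60, 0.65, 0.70, 0.83)}
for value, status in statuses.items():
    print(f"{value:.2f}: {status}")
```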
The CAHPS Consortium has reported the reliability of the CAHPS measures at the appropriate unit of comparison since the beginning of the project over 25 years ago and for measure development throughout the project (e.g., Hays, Martino, et al., 2014; Hays, Berman, et al., 2014; Price, Stucky, et al., 2018).
For the Rating of Specialist measure, fewer than 10% of respondents had missing values for the item. We present the site reliability excluding cases where data are missing.
References
Hays, R, Arnold, S. (1986). Patient and family satisfaction with care for the terminally ill. Hospice Journal, 7, 129-150.
Hays, R, Berman, L, Kanter, M, et al. (2014). Evaluating the psychometric properties of the CAHPS Patient-Centered Medical Home Survey. Clin Ther. 36(5), 689–696.e1.
Hays, RD, Martino, S, Brown, J, Cui, M, Cleary, P, Gaillot, S, Elliott, M. (2014). Evaluation of a care coordination measure for the Consumer Assessment of Healthcare Providers and Systems (CAHPS®) Medicare Survey. Medical Care Research and Review, 71, 192-202.
Hays, R, Shaul, J, William, V, et al. (1999). Psychometric properties of the CAHPS™ 1.0 Survey measures. Medical Care. 37(3) SUPPLEMENT, MS22-MS31.
Nunnally, JC, Bernstein, IH. (1994). Psychometric Theory. New York: McGraw Hill.
Price, RA, Stucky, B, Parast, L, Elliott, MN, Haas, A, Bradley, M, Teno, JM. (2018). Development of valid and reliable measures of patient and family experiences of hospice care for public reporting. Journal of Palliative Medicine, 21, 924-932.
The Rating of Specialist measure’s overall site-level reliability is 0.65. Table 2 provides overall and decile-level reliability. Deciles for this table are based on the total number of respondents per plan, and the “Mean Performance Score” row shows the mean top-box scores averaged across health plans.
| | Overall | Minimum | Decile_1 | Decile_2 | Decile_3 | Decile_4 | Decile_5 | Decile_6 | Decile_7 | Decile_8 | Decile_9 | Decile_10 | Maximum |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Reliability | 0.65 | NA | 0.61 | 0.38 | 0.42 | 0.50 | 0.66 | 0.65 | 0.50 | 0.56 | 0.76 | 0.83 | NA |
| Mean Performance Score | 72.30 | 67.00 | 72.48 | 69.60 | 72.96 | 72.83 | 71.71 | 73.23 | 72.96 | 71.39 | 72.87 | 73.18 | 68.00 |
| N of Entities | 230 | 1 | 25 | 20 | 23 | 23 | 24 | 22 | 23 | 23 | 23 | 22 | 1 |
| N of Persons / Encounters / Episodes | 27,054 | 21 | 812 | 928 | 1,288 | 1,690 | 2,075 | 2,331 | 2,856 | 3,399 | 4,209 | 6,443 | 1,002 |
The site reliability assessment for the Rating of Specialist measure indicates that the measure maintains acceptable reliability (i.e., 0.70 or greater) for deciles 9 and 10. While overall reliability and the reliability of deciles 1, 5, and 6 fall below 0.70, they still meet the requirement for acceptability in the E&M Guidebook (greater than or equal to 0.60), although decile 1 (0.61) does so only narrowly. Reliability is below 0.60 for deciles 2-4 and 7-8. The average number of respondents per plan in deciles 1-4 and 7-8 was: decile 1: 32 respondents; decile 2: 46 respondents; decile 3: 56 respondents; decile 4: 73 respondents; decile 7: 124 respondents; and decile 8: 148 respondents. These are far below the recommendation of 300 completed surveys. In contrast, deciles 9 and 10, which met the 0.70 threshold, had an average of at least 183 respondents. The low average number of respondents in many deciles is primarily due to gate or filter questions for this measure: only respondents who have interacted with a specialist are eligible to answer this item. Since not all enrollees see specialists, plans may need higher respondent counts to achieve better reliability for this measure.
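The per-plan respondent averages quoted above can be recomputed directly from Table 2 by dividing the number of persons by the number of entities in each decile. The sketch below does only that arithmetic; the variable names are illustrative, and the values are copied from the table.

```python
# decile: (reliability, N of entities, N of persons), copied from Table 2.
TABLE_2 = {
    1: (0.61, 25, 812),
    2: (0.38, 20, 928),
    3: (0.42, 23, 1288),
    4: (0.50, 23, 1690),
    5: (0.66, 24, 2075),
    6: (0.65, 22, 2331),
    7: (0.50, 23, 2856),
    8: (0.56, 23, 3399),
    9: (0.76, 23, 4209),
    10: (0.83, 22, 6443),
}

# Average respondents per plan in each decile.
avg_respondents = {d: persons / plans for d, (_, plans, persons) in TABLE_2.items()}

# Deciles grouped by the two reliability thresholds discussed in the text.
below_060 = [d for d, (rel, _, _) in TABLE_2.items() if rel < 0.60]
at_least_070 = [d for d, (rel, _, _) in TABLE_2.items() if rel >= 0.70]

for d, (rel, _, _) in TABLE_2.items():
    print(f"decile {d}: reliability {rel:.2f}, about {avg_respondents[d]:.0f} respondents per plan")
```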
Validity
Several model fit indices were examined to determine how well the hypothesized factor structure, or composite measures, fit the data, including chi-square divided by its degrees of freedom (χ²/df) (criteria: values less than 5.0; Schumacker & Lomax, 2004), comparative fit index (CFI) (criteria: values 0.95 or greater; Hu & Bentler, 1999), root mean square error of approximation (RMSEA) (criteria: values less than 0.06; Kline, 2005), and the standardized root mean square residual (SRMR) (criteria: values less than 0.08; Kenny, 2020).
We examined standardized factor loadings for each item on its respective composite measure. Factor loadings above 0.40 indicate that the item’s relationship to the composite measure is acceptable (Stevens, 2002).
References
Hu, L., & Bentler, PM. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6(1), 1–55. http://dx.doi.org/10.1080/10705519909540118
Kenny, DA. (2020, June 5). Measuring model fit. Available at http://davidakenny.net/cm/fit.htm. Accessed October 2025.
Kline, RB. (2005). Principles and practice of structural equation modeling (2nd ed.) New York: The Guilford Press.
Schumacker, R., & Lomax, R. (2004). A beginner’s guide to structural equation modeling (2nd ed.). Lawrence Erlbaum.
Stevens, JP. (2002). Applied multivariate statistics for the social sciences (4th ed.). Mahwah, NJ: Lawrence Erlbaum.
Tables 5.3.4a and 5.3.4b (attached in section 5.3.4a) show results for the model fit indices and standardized factor loadings for the adult and child datasets, respectively.
The Confirmatory Factor Analysis results for both the Adult and Child HP CAHPS Surveys demonstrate strong model fit on three of the four established criteria. The chi-square divided by degrees of freedom (χ²/df) was 12.52 for the Adult survey and 8.5 for the Child survey. Both values exceed the recommended threshold of less than 5.0, which is common in large samples because this index is sensitive to sample size. The other fit indices all fall well within acceptable ranges: the Comparative Fit Index (CFI) was 0.99 for Adult and 0.98 for Child, surpassing the criterion of 0.95 or greater and indicating excellent model fit. The Root Mean Square Error of Approximation (RMSEA) was 0.05 for both surveys, meeting the standard of less than 0.06, and the Standardized Root Mean Square Residual (SRMR) was 0.03 for both, comfortably below the threshold of 0.08.
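The comparison of the reported fit indices against their criteria can be laid out explicitly. The sketch below only checks the values reported in this section against the stated thresholds; it does not fit a factor model, and the dictionary keys are illustrative names.

```python
# Thresholds as stated in the text.
CRITERIA = {
    "chi2_df": lambda v: v < 5.0,   # Schumacker & Lomax, 2004
    "cfi": lambda v: v >= 0.95,     # Hu & Bentler, 1999
    "rmsea": lambda v: v < 0.06,    # Kline, 2005
    "srmr": lambda v: v < 0.08,     # Kenny, 2020
}

# Fit indices as reported for each survey.
REPORTED = {
    "Adult": {"chi2_df": 12.52, "cfi": 0.99, "rmsea": 0.05, "srmr": 0.03},
    "Child": {"chi2_df": 8.5, "cfi": 0.98, "rmsea": 0.05, "srmr": 0.03},
}

results = {
    survey: {name: CRITERIA[name](value) for name, value in indices.items()}
    for survey, indices in REPORTED.items()
}

for survey, checks in results.items():
    met = [name for name, ok in checks.items() if ok]
    unmet = [name for name, ok in checks.items() if not ok]
    print(f"{survey}: met {met}, not met {unmet}")
```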
The estimates for each standardized factor loading on the items in the composite measures assess convergent validity. All standardized factor loadings are above 0.5, with the majority above 0.8, and all are statistically significant (p < 0.001), demonstrating the convergent validity of the measures.
These results support the hypothesized factor structure for the measures in the surveys.
At the individual and plan levels, we used Spearman rank-order correlations to examine the relationships between the top-box scores for each composite measure and its items and the top-box scores for the rating measures. Overall, we expected the composite measures to be moderately to strongly related to the rating measures.
We also examined Spearman rank-order intercorrelations among the composite measures to assess the extent to which they measure different constructs. As measures of patient experience, we expected the composite measures to be significantly and positively correlated (e.g., Hays et al., 2018). However, very large intercorrelations (e.g., > 0.80) suggest that the composite measures may not be sufficiently distinct to be considered separate measures (O’Brien, 2007).
One rule of thumb for correlations is:
- 0.10 is a small correlation
- 0.30 is a medium correlation
- 0.50 is a large correlation
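To make the approach concrete, the sketch below computes a Spearman rank-order correlation (the Pearson correlation of average ranks) and labels its size using the rule of thumb above. The plan-level scores are hypothetical, the "negligible" label for values under 0.10 is an assumption not stated in the rule of thumb, and this is not the code used for the analysis.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for idx in order[i:j + 1]:
            ranks[idx] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def magnitude(r):
    """Label a correlation by the rule of thumb; 'negligible' is an assumed label."""
    size = abs(r)
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "negligible"


# Hypothetical plan-level top-box scores for a composite and a rating measure.
composite = [70, 72, 68, 75, 71]
rating = [80, 79, 77, 85, 81]
rho = spearman(composite, rating)
print(round(rho, 2), magnitude(rho))
```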
For the Rating of Specialist measure, we expected a positive, small to medium relationship with the HP CAHPS composite measures, with the strongest relationship expected for Getting Needed Care, since both measures assess experience with the specialist.
References
Hays, RD., Mallett, JS., Haas, A., Kahn, KL., Martino, SC., Gaillot, S., Elliott, MN. (2018). Associations of CAHPS composites with global ratings of the doctor vary by Medicare beneficiaries’ health status. Medical Care, 56(8), 736-739.
O’Brien, RM. (2007). A caution regarding rules of thumb for variance inflation factors. Qual Quant. 41, 673–690.
The Rating of Specialist measure is positively and significantly associated with the four composite measures on the Child HP CAHPS survey. The strongest correlations with Rating of Specialist were observed with Getting Needed Care (r = 0.27 at the plan level) and Health Plan Customer Service (r = 0.25 at the plan level). At the individual level, similar patterns emerged, with small to medium correlations across all measures.
Please see Attachment 5.3.4a for validity testing results, which include the plan-level and individual-level correlations between the Rating of Specialist measure and the composite measures and items in the Child HP CAHPS Survey, all assessed using Spearman rank-order correlations.
Overall, the composite measures were positively and significantly correlated with the Rating of Specialist measure. All correlations between the Rating of Specialist measure and the composite measures fell within an acceptable range at the plan and individual level (between 0.14 and 0.35). The results support the distinctiveness of the Rating of Specialist measure as a measure of patient experience, while also confirming its meaningful relationship to broader aspects of care.
Risk Adjustment
The Child HP CAHPS Survey results are not required to be risk adjusted by users. However, survey users, including public reporting entities, may voluntarily choose to adjust the data to account for patient case-mix differences when comparing plans. Guidance for this process is available in two key documents: “Preparing Data for Analysis” (available at https://www.ahrq.gov/sites/default/files/wysiwyg/cahps/surveys-guidance…) and “Instructions for Analyzing Data from CAHPS Surveys” dated June 2025 (available at https://www.ahrq.gov/sites/default/files/wysiwyg/cahps/surveys-guidance…). These resources provide instructions for coding the adjuster variables, imputing missing data for the adjusters, and incorporating adjustments in analyses using the CAHPS Analysis Program in SAS. The selection of adjuster variables and the calculation of risk-adjusted scores are user-defined. Users must also decide whether to impute missing data for the adjusters using each adjuster’s entity-level mean.
The CAHPS Analysis Program is a set of free SAS programs that enable survey users to conduct risk adjustment. The programs with test modules are available for download at https://www.ahrq.gov/cahps/surveys-guidance/helpful-resources/analysis/….
The CAHPS Analysis Program adjusts the data for case mix, generates a distribution of survey results for each of the measures, calculates the average score for both individual survey items and composite measures, and indicates whether an entity’s scores are statistically different from the average. AHRQ’s CAHPS Consortium developed the CAHPS Analysis Program to work with all CAHPS surveys. It is updated periodically to add functionality, produce additional output types, and correct or debug issues with previous versions.
This section describes the rationale for case-mix adjustment, which is not required but which CAHPS users may elect to apply. The standard methodology is case-mix adjustment via regression in a linear model. Without an adjustment, differences in CAHPS scores between entities could be due to case-mix differences rather than true differences in quality.
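The idea behind regression-based case-mix adjustment can be shown with a deliberately simplified sketch. It is not the CAHPS Analysis Program: it uses a single adjuster rather than several, estimates one pooled within-plan slope, and then shifts each plan's mean to what it would be at the overall average of the adjuster. The function name, the plan labels, the adjuster coding, and the data are all hypothetical.

```python
def case_mix_adjust(records):
    """records: list of (plan, adjuster, score). Returns {plan: (unadjusted, adjusted)}."""
    by_plan = {}
    for plan, x, y in records:
        by_plan.setdefault(plan, []).append((x, y))

    grand_x = sum(x for _, x, _ in records) / len(records)

    # Pooled within-plan slope of score on the adjuster (plan effects removed).
    sxy = sxx = 0.0
    plan_means = {}
    for plan, rows in by_plan.items():
        mx = sum(x for x, _ in rows) / len(rows)
        my = sum(y for _, y in rows) / len(rows)
        plan_means[plan] = (mx, my)
        sxy += sum((x - mx) * (y - my) for x, y in rows)
        sxx += sum((x - mx) ** 2 for x, _ in rows)
    slope = sxy / sxx if sxx else 0.0

    # Shift each plan mean to the overall average case mix.
    return {
        plan: (my, my - slope * (mx - grand_x))
        for plan, (mx, my) in plan_means.items()
    }


# Hypothetical data: plan B serves respondents with higher adjuster values.
data = [
    ("A", 1, 60), ("A", 2, 70), ("A", 3, 80),
    ("B", 3, 80), ("B", 4, 90), ("B", 5, 100),
]
adjusted = case_mix_adjust(data)
for plan, (raw, adj) in adjusted.items():
    print(f"plan {plan}: unadjusted {raw:.1f}, adjusted {adj:.1f}")
```

In this constructed example the 20-point gap between the unadjusted plan means disappears after adjustment, because the whole difference is explained by the adjuster rather than by the plans themselves.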
The current CAHPS Analysis Program suggests adjusting for the child’s general health status, the child’s mental health status, respondent’s age, and respondent’s education. Studies have found that patient and consumer survey responses about experiences and satisfaction with healthcare correlate with personal characteristics like general health, mental health/depression, education, and age (Simon et al., 2009; Rahmqvist and Bara, 2010; Zaslavsky et al., 2001; Martino et al., 2011; Elliott et al., 2009).
Health status and age are two respondent characteristics frequently associated with reports of the quality of medical care. People in worse health tend to report lower satisfaction and more problems with care than do people in better health perhaps because sicker patients have more complex health care needs and may tend to report more problems with coordination or communication. Older patients tend to report greater satisfaction, better patient experience, and fewer problems than younger patients, although this association is usually not as strong as that between health status and ratings (Hatfield and Zaslavsky, 2017; Eselius et al., 2008).
Education is self-reported by respondents who take the CAHPS surveys. Studies have shown that more educated respondents report more problems, perhaps because they have higher expectations rather than because they receive lower quality care (Sofaer and Firminger, 2005). However, in a multivariate analysis using Medicare Advantage CAHPS data, Hatfield and Zaslavsky (2017) found that education had less influence on CAHPS dimension scores than self-reported general and mental health.
Different CAHPS surveys adjust for different variables, and the variables included here are not the only adjustment factors, even for the health plan setting. It should be noted that race/ethnicity is not typically included as a case-mix adjuster variable, because doing so may mask disparities in care experiences. CAHPS data can also be adjusted for other factors such as survey administration mode (Peipert et al., 2017). For example, a study by Drake and colleagues (2014) found that telephone respondents gave more positive responses than mail respondents did. Currently, the AHRQ HP CAHPS Database does not adjust for survey mode.
The AHRQ HP CAHPS Database adjusts only for respondent age, respondent education, the child’s general health status, and the child’s mental health status.
References
Drake, KM, Hargraves, JL, Lloyd, S, Gallagher, PM, Cleary, PD. (2014). The effect of response scale, administration mode, and format on responses to the CAHPS Clinician and Group Survey. Health Serv Res. 49(4), 1387-99. doi: 10.1111/1475-6773.12160.
Elliott, MN, Zaslavsky, AM, Goldstein, E, Lehrman, W, Hambarsoomians, K, Beckett, MK, Giordano, L. (2009). Effects of survey mode, patient mix, and nonresponse on CAHPS hospital survey scores. Health Serv Res. 44(2 Pt 1), 501-18. doi: 10.1111/j.1475-6773.2008.00914.x.
Eselius, LL, Cleary, PD, Zaslavsky, AM, Huskamp, HA, Busch, SH. (2008). Case-mix adjustment of consumer reports about managed behavioral health care and health plans. Health Services Research. 43(6), 2014-2032.
Hatfield, LA, Zaslavsky, AM. (2017). Implications of variation in the relationships between beneficiary characteristics and Medicare Advantage CAHPS measures. Health Serv Res Aug;52(4),1310-1329.
Martino, SC., Elliott, MN., Kanouse, DE., Farley, DO., Burkhart, Q., Hays., RD. (2011). Depression and the health care experiences of Medicare beneficiaries. Health Services Research. 46(6pt1), 1883–904.
Peipert, JD, Brown, JA, Cui, M, Hays, RD (2017). Differences in mail and telephone responses to the CAHPS In-Center Hemodialysis Survey. Ann Clin Nephrol Vol.1 No.1: 1.
Rahmqvist, M, Bara, AC. (2010). Patient characteristics and quality dimensions related to patient satisfaction. International Journal for Quality in Health Care. 22(2), 86–92.
Simon, GC, Rutter, M, Crosier, J, Scott, BH, Operskalski, LE. (2009). Are comparisons of consumer satisfaction with providers biased by nonresponse or case-mix differences? Psychiatric Services. 60(1), 67–73.
Sofaer, S, Firminger, K. (2005). Patient perceptions of the quality of health services. Annual Review of Public Health. 26, 513–59.
Zaslavsky, A, Zaborski, L, Ding, L, Shaul, JA, Cioffi, MJ, Cleary, PD. (2001). Adjusting performance measures to ensure equitable plan comparisons. Health Care Financing Review. 22(3), 109–26.
Tables 5.4.3a through 5.4.3c in Attachment 5.4.3 show descriptive characteristics for the risk/case-mix variables tested: respondent age, child’s general health status, child’s mental health status, and respondent education. Across health plans, 30% of respondents were 35-44 years old. Forty-three percent (43%) of respondents reported having some college education or higher. Sixty-six percent (66%) of respondents reported that their child was in excellent or very good general health, and 59% reported that their child had excellent or very good mental health.
Case-mix adjustment was conducted using the AHRQ CAHPS Analysis Program (https://www.ahrq.gov/cahps/surveys-guidance/helpful-resources/analysis/…). We set the program to impute the plan mean when data were missing for an adjuster variable, to avoid losing observations because of missing data. This is generally acceptable because the size of the adjustment and the amount of missing data on adjusters are typically small. The case-mix adjusted scores are based on regression analyses, and the results include case-mix adjusted coefficients. For detailed information on case-mix adjustment, refer to page 46 of the “Instructions for Analyzing Data from CAHPS Surveys” document (https://www.ahrq.gov/sites/default/files/wysiwyg/cahps/surveys-guidance…).
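Plan-mean imputation of a missing adjuster can be sketched in a few lines. This is an illustration of the idea only, not the CAHPS Analysis Program's implementation; the record layout, field names, and values are hypothetical.

```python
def impute_plan_mean(records):
    """Replace a missing adjuster (None) with the mean of observed values in the same plan."""
    totals = {}
    for r in records:
        if r["adjuster"] is not None:
            s, n = totals.get(r["plan"], (0.0, 0))
            totals[r["plan"]] = (s + r["adjuster"], n + 1)

    imputed = []
    for r in records:
        value = r["adjuster"]
        # Leave the value missing if the plan has no observed values at all.
        if value is None and r["plan"] in totals:
            s, n = totals[r["plan"]]
            value = s / n
        imputed.append({**r, "adjuster": value})
    return imputed


# Hypothetical respondent records with one adjuster.
respondents = [
    {"plan": "A", "adjuster": 2},
    {"plan": "A", "adjuster": 4},
    {"plan": "A", "adjuster": None},
    {"plan": "B", "adjuster": 5},
    {"plan": "B", "adjuster": None},
]
completed = impute_plan_mean(respondents)
print([r["adjuster"] for r in completed])
```

Because every respondent keeps a usable adjuster value, no observation is dropped from the regression for lack of case-mix data.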
To quantify the effect of case-mix adjustment on the ranking of health plans, we calculated the Pearson product moment correlation coefficient, the Kendall Tau correlation coefficient, and the maximum difference between the adjusted and unadjusted plan mean scores as shown in Attachment 5.4.5a. The adjustment factors include respondent age, respondent education, the child’s mental health status and the child’s general health status. Appropriate case-mix adjusters are respondent- and patient-level variables that are not under the control of the provider (“exogenous”) but result in different scores even when the quality of care is the same. For example, respondent education and age are not under the control of the provider.
Eselius and colleagues (2008) published results of an analysis of case-mix adjustments for the HP CAHPS Survey. After selecting appropriate adjusters based on explanatory power in separate linear regression models, the authors determined the impact of case-mix adjustment on their sample health plans. Specifically, they examined the size of the adjustments and the extent to which adjustments impacted the ranking of health plans. Case-mix adjustments had only modest effects on health plan ratings and rankings. The authors also found that mental health status was a strong predictor of patient-reported experience.
The Pearson product-moment correlation coefficient is widely used and understood. It assesses the linear association between the adjusted and unadjusted scores and ranges from -1 to +1. Because the ranking of scores is often important in public reports of CAHPS results, we also calculate Kendall’s Tau (Kendall rank correlation coefficient). Tau is the correlation between the rank orders of the adjusted and unadjusted scores. It also ranges from -1 to +1, which makes it comparable to other correlation coefficients. The quantity 100*(1-Tau)/2 can be interpreted as the percentage of pairs of units (e.g., health plans) that switched ordering because of case-mix adjustment.
We also calculated the maximum absolute value difference between adjusted and unadjusted mean scores among health plans.
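The three summary statistics can be illustrated as follows. The plan scores are hypothetical, and the Kendall statistic shown is the simple tau-a form (ties counted as neither concordant nor discordant), which is an assumption; the analysis software may use a tie-corrected variant.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def kendall_tau(x, y):
    """Tau-a: (concordant pairs - discordant pairs) / total pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)


def pct_pairs_switched(tau):
    """Percentage of pairs whose ordering changed: 100*(1-Tau)/2."""
    return 100 * (1 - tau) / 2


def max_abs_difference(x, y):
    return max(abs(a - b) for a, b in zip(x, y))


# Hypothetical unadjusted and adjusted mean scores for four plans.
unadjusted = [70.0, 72.0, 74.0, 76.0]
adjusted = [70.5, 73.9, 73.6, 76.2]

tau = kendall_tau(unadjusted, adjusted)
print(round(pearson(unadjusted, adjusted), 3))
print(round(tau, 3), round(pct_pairs_switched(tau), 1))
print(round(max_abs_difference(unadjusted, adjusted), 2))
```

In this toy example one of the six plan pairs changes order, which matches 100*(1-Tau)/2 of about 16.7%. Applying the same formula to a Tau of 0.79 gives 10.5% of pairs switching order.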
Reference
Eselius, LL., Cleary, PD., Zaslavsky, AM., Huskamp, HA., & Busch, SH. (2008). Case-mix adjustment of consumer reports about managed behavioral health care and health plans. Health Services Research. 43(6), 2014-2032.
The selection of case-mix adjusters is based on prior research across multiple datasets for the CAHPS Health Plan Survey Database and is consistent with how scores are calculated. Results are found in Attachment 5.4.4a and 5.4.5a. All adjusters were statistically significant in the regression models.
Adjusted and unadjusted top-box scores were very similar: the Pearson correlation between unadjusted and adjusted mean scores was 0.95, and the Kendall Tau correlation was 0.79. Further, the maximum difference in the mean score for this measure was 0.29. This suggests that adjusting for case mix can help level the playing field for this measure.
The final model includes four adjusters: respondent age, respondent education, child’s general health status, and child’s mental health status.
Use & Usability
Use
OPM assesses the annual performance of health plans contracted under the FEHB program. Each year, FEHB plans send the adult version of the CAHPS® survey to a sample of plan members to evaluate their plan experiences.
The FEHB Program provides private health insurance to about 8.3 million federal employees, retirees, and their dependents across the United States. There are approximately 180 health plan choices.
Level of analysis – health plans
The purpose of publishing rankings and of the accreditation program is to make quality information on health plans available to consumers and to certify that health plans meet basic requirements for consumer protection and quality improvement (adult and child measures).
In 2024, NCQA rated over 1,000 plans across the United States. NCQA lists private (commercial), Medicare, and Medicaid health insurance plans based in part on their CAHPS® scores. Number of patients: Information not available.
Level of analysis – health plans
CMS publicly reports plan-level CAHPS scores for consumers of Medicare Advantage Plans and Part D Prescription Drug Plans. The results from the Medicare CAHPS surveys (Adult version) are published in the Medicare & You handbook each Fall and on the Medicare Web site.
Approximately 600 health plans across the United States. Number of patients: Information not available; however, estimates show approximately 32.8 million patients enrolled in Medicare Advantage.
Level of analysis- state level, health plans
The Health Insurance Marketplace conducts the Qualified Health Plan (QHP) Survey, a version of the Adult CAHPS Health Plan Survey, to provide consumers with valuable information about health plan quality and help them make informed decisions when selecting a plan.
All qualified health plans (QHPs). In 2025 there were 206 QHPs. For 2025, CMS requires that QHP issuers use a HEDIS® Compliance Auditor and follow the HEDIS® Compliance Audit standards to validate the QHP Enrollee Survey sample frame. Number of patients: Information not available.
Level of analysis - health plans
To facilitate comparisons of CAHPS survey results by and among survey sponsors. This compilation of survey results from a large pool of survey users into a national database enables participants to compare their own results to relevant benchmarks.
The 2024 CAHPS Database includes data from 233 Adult Medicaid Health Plans, 234 Child Medicaid Health Plans, and 48 Children's Health Insurance Programs (CHIP). Number of patients: Information is not available.
Level of analysis- individual aggregate and health plan
To promote the objectives of ensuring access to high-quality care through a standardized set of measures to assess the quality of care provided to Medicaid and CHIP beneficiaries.
A subset of data submitted to the 2024 AHRQ CAHPS Database includes data from 42 states for Adult Medicaid, 47 states for Child Medicaid and CHIP. Number of patients: Information not available.
Level of analysis- state
Usability
Actions to Improve Patient Experience
CAHPS® surveys play an important role as a quality improvement (QI) tool for healthcare organizations that use the standardized data to:
- Identify relative strengths and weaknesses in their performance.
- Determine where they need to improve.
- Track their progress over time.
AHRQ has made available the CAHPS Ambulatory Care Improvement Guide, a comprehensive resource for health plans, medical groups, and other providers seeking to improve their performance in the domains of patient experience measured by CAHPS surveys. AHRQ has also created a short video to help improve patient experience: https://www.ahrq.gov/cahps/quality-improvement/index.html#:~:text=CAHPS…
The steps are:
- Compare CAHPS survey scores to other health care organizations to determine how the plan is doing in comparison to others.
- Examine how CAHPS scores are changing over time.
- Identify priorities based on these comparisons
- Confirm these priorities based on other sources of information (e.g., patient complaints, patient comments)
- Find out what is actually happening with patients and why.
- Brainstorm with staff to determine the best strategies for improvement.
In addition, AHRQ held a research meeting in 2020 to discuss how to improve patient experience and provided summaries of the presentations: https://www.ahrq.gov/cahps/news-and-events/events/2020-meeting-summary….
Difficulty in Increasing Response Rates
Users are also provided advice for improving response rates (AHRQ, 2008):
- Improve initial contact rates by making sure that addresses and phone numbers are current and accurate (e.g., identify sources of up-to-date sample information, run a sample file through a national change-of-address database, send a sample to a phone number look-up vendor).
- Use all available tracking methods (e.g., Lexis-Nexis, Internet database services and directories).
- Improve contact rates after data collection has begun (e.g., increase maximum number of calls, ensure that calls take place at different day and evening times over a period of days, mail second reminders, use experienced and well-trained interviewers).
- Consider using a mixed-mode protocol. In field tests, the combined approach was more likely to achieve a desired response rate than was either mode alone.
- Train interviewers on how to deal with gatekeepers.
- Train interviewers on refusal aversion/conversion techniques.
AHRQ has several resources for health plans to improve performance on the CAHPS Surveys. They created the CAHPS Ambulatory Care Improvement Guide, which is a comprehensive resource for health plans seeking to improve their performance: https://www.ahrq.gov/cahps/quality-improvement/improvement-guide/improv…. There are also case studies and webcasts that share insights and best practices in improving patient experience with care: https://www.ahrq.gov/cahps/surveys-guidance/hp/improve/index.html.
To improve on Rating of Specialist, health plans can encourage specialists to keep some appointments available each day for urgent visits, treat patients and their parents/guardians with empathy and respect, make eye contact, take the time to listen carefully, and ask if all questions have been addressed. Further, they can encourage use of the teach-back method to ensure the patient’s or parent/guardian’s understanding. The teach-back method asks the patient or parent/guardian to explain, in their own words, what they heard and what they need to do after leaving. Information and resources on the teach-back method are available on the AHRQ website (https://www.ahrq.gov/patient-safety/reports/engage/interventions/teachb…).
As part of CAHPS development and maintenance, the CAHPS Consortium has sought input from multiple users, including accreditors, health plans, and the public. Throughout the development process, the CAHPS Consortium has incorporated the data or input from these various sources in an incremental process of revision and refinement to develop more precise measurement and to produce survey data that better meet the information needs of consumers and other stakeholders. The CAHPS Consortium hears user feedback during research studies and development. Users can contact the CAHPS Database team with questions or comments by phone at 888-808-7108 or email at [email protected]. The CAHPS Consortium also solicits feedback via focus groups with patients when developing survey content and design. We are not aware of any substantial problems experienced by health plans or respondents.
No additional detail.
The current 5.1 version of the instrument includes revisions based on feedback received since the development of the HP CAHPS Survey. For example, version 5.0 incorporated minor changes to the wording of core items based on input gathered in consultation with stakeholders. Questions about access to urgent and non-urgent appointments were modified to ask respondents if they were able to get an appointment “as soon as they needed,” rather than as soon as “they thought” they needed, for consistency across all CAHPS surveys. Similarly, the item about how often it was easy to get care was moved from the “Your Health Plan” section to the “Your Health Care” section because respondents reported difficulty attributing this item to the health plan. For the 5.1 version, with the COVID-19 pandemic changing how some health care was delivered (e.g., video or phone rather than in-person), the instrument was updated again to change instructions and gate question wording to include these types of visits.
No additional detail.
The 2024 AHRQ CAHPS Health Plan Survey Chartbook presents aggregated results from 2014 to 2024 (https://www.ahrq.gov/sites/default/files/wysiwyg/cahps/cahps-database/2…). Overall, Child HP CAHPS Survey scores were gradually improving until around 2020/2021 when the COVID-19 pandemic occurred, and scores started to decline. Scores continued to drop until this past year, when they began to show signs of recovery.
Scores for Rating of Specialist improved gradually from 2014 through 2020, rising from 70% to 74%. However, performance declined to 71% in 2023, followed by a modest increase to 72% in 2024. This trend shows a drop in performance between 2020 and 2023, with limited improvement in the most recent year. Scores remain below 2020 levels, highlighting the need for targeted strategies to improve patient experience, particularly by strengthening the relationship between the patient/parent/guardian and specialists.
No unexpected findings.
No unexpected findings.
Comments
Staff Preliminary Assessment
CBE #0006-15 Staff Preliminary Assessment
Importance
Strengths
- Data from 2024 show a performance gap, with top-box score decile ranges from 62.1% to 82.6%, indicating variability in performance between health plans and less than optimal performance across all health plans (ideal performance is 100%).
Limitations
- The logic model provided does not clearly articulate the relationships between inputs, activities, and outcomes. For example, the logic model does not clearly depict how the HP-CAHPS Rating of Specialist measure leads to improvements in clinical quality. Further, the logic model does not include assumptions, external factors, or feedback mechanisms. The submission could be strengthened by more clearly depicting the relationship between the HP-CAHPS Rating of Specialist and inputs, activities, outputs, and outcomes. It could also be strengthened by stating assumptions, external factors, and feedback mechanisms.
The evidence review includes three literature reviews from 2013, 2024, and 2021 which show patient experience measures are related to clinical outcomes. However, it does not include empirical evidence linking Rating of Specialist to clinical outcomes. The studies cited are not specific to parents’/guardians’ ratings of specialists providing care to their children.
Patient input is either not sufficiently sought or does not clearly support the conclusion that the measure is meaningful. The measure developer cites three empirical studies from 2007 and 2009 which demonstrate patients use patient experience measures to make decisions. However, these data are not specific to measures reporting patients’ Rating of Specialist. They are also not specific to parents’/guardians’ ratings of specialists providing care to their children. The degree of certainty from patient input is low.
The submission could be strengthened by incorporating more findings from the CAHPS Consortium’s literature review, focus groups, Technical Expert Panel, and other development activities which show the importance of this measure construct to patients and/or a link between parents’/guardians’ ratings of specialists providing care for their children and health outcomes. It could also be enhanced by including more recent literature related to the measure focus.
Rationale
- The maintenance measure is rated as 'Not Met But Addressable' for importance due to a non-specific logic model and insufficient patient input/meaningfulness. Enhancements, including a greater focus on the construct addressed by this measure (rating of specialist), a more specific focus on care provided to children (the focus of this measure), and a more robust description of efforts to ensure patient meaningfulness, could elevate its importance.
Closing Care Gaps
The developer did not address this optional domain.
Feasibility Assessment
Strengths
- This is a patient-reported experience performance measure (PRE-PM). These data are collected from parents/guardians of pediatric patients and are not available in a structured source outside the health plan or sponsoring organization.
The survey can be administered electronically; however, the developer states that non-electronic response options should be available for enrollees with limited internet access. Mail is the most frequent mode for CAHPS surveys.
The developer indicates that the only change to the instrument was a wording change that allows respondents to consider care that was provided virtually (i.e., by phone or video). They assert this change did not impact data structure or availability.
The developer addresses burden associated with data entry, validation, and analysis. They discussed electronic feasibility, missing data, susceptibility to inaccuracies, and ability to audit data. They note the survey takes about 15 minutes to complete, depending on the respondent. Sampling uses administrative and enrollment data that is maintained by the health plans. Because the measure is collected outside of healthcare encounters, there is no impact on patient-physician interactions.
The developer described how all required data elements can be collected without risk to patient confidentiality, including administering the survey so the data are de-identified upon collection, only reporting responses in aggregate form, and not reporting results if there are fewer than 10 respondents.
There are no fees, licensing, or other requirements to use any aspect of the measure (e.g., value/code set, risk model, programming code, algorithm).
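The confidentiality safeguard described above (aggregate reporting only, with no results reported for fewer than 10 respondents) can be sketched as follows. The threshold of 10 comes from the submission; the function name, data layout, and the 9-or-10 top-box definition are assumptions made for this illustration.

```python
MIN_RESPONDENTS = 10  # threshold stated in the submission


def reportable_scores(responses_by_plan, top_values=(9, 10)):
    """Return aggregate top-box scores, suppressing plans with small cells."""
    report = {}
    for plan, ratings in responses_by_plan.items():
        n = len(ratings)
        if n < MIN_RESPONDENTS:
            # Too few respondents: withhold the score entirely.
            report[plan] = {"n": n, "score": None, "suppressed": True}
            continue
        score = 100.0 * sum(r in top_values for r in ratings) / n
        report[plan] = {"n": n, "score": round(score, 1), "suppressed": False}
    return report


# Hypothetical example: one plan below the threshold, one at it.
example = reportable_scores({
    "small_plan": [10] * 9,
    "larger_plan": [10] * 5 + [5] * 5,
})
```

Only the aggregate score leaves the function, so individual responses are never part of the reported output.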
Limitations
- The feasibility domain can be strengthened by providing the median cost of vendor engagement to administer CAHPS (or a similar metric).
Rationale
- This maintenance measure meets all criteria for “Met” for feasibility due to its well-documented feasibility assessment, clear and implementable data collection strategy, and transparent handling of patient confidentiality, burden, licensing, and fees. These factors collectively ensure that the measure can be implemented effectively and sustainably in a real-world healthcare setting.
Scientific Acceptability
Strengths
- Data sources used for reliability analysis are adequately described and include a database with survey results collected from July 2023 through June 2024.
The developer conducted reliability testing using the ICC and the Spearman-Brown prophecy formula at the accountable entity-level.
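To make the two statistics named above concrete, the following sketch computes a one-way ANOVA intraclass correlation (ICC) and applies the Spearman-Brown prophecy formula to project entity-level reliability at a given number of completed surveys. This is one common formulation and is not confirmed to be the developer's exact procedure; the function names and the sample data are hypothetical.

```python
def one_way_icc(groups):
    """ICC(1) from a one-way ANOVA; groups is a list of per-entity response lists."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    total_n = sum(sizes)
    grand_mean = sum(sum(g) for g in groups) / total_n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (total_n - k)
    # Effective group size, which handles unequal numbers of respondents.
    n0 = (total_n - sum(n * n for n in sizes) / total_n) / (k - 1)
    return (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)


def spearman_brown(icc, n):
    """Projected reliability of an entity mean based on n responses."""
    return n * icc / (1 + (n - 1) * icc)


# Hypothetical top-box indicators (1 = rating of 9 or 10) for three entities.
entities = [[1, 1, 0, 1, 1], [0, 1, 0, 0, 1], [1, 1, 1, 0, 1, 1]]
icc = one_way_icc(entities)
projected = spearman_brown(icc, 300)  # reliability if each entity had 300 surveys
```

The ICC describes the share of variance attributable to differences between entities for a single response; the Spearman-Brown step scales that to the reliability of an average over many responses.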
Limitations
- The developer performed reliability testing for this maintenance measure: accountable entity-level reliability testing at the site level using unadjusted measure scores rather than case-mix adjusted measure scores. The developer's rationale is that case-mix adjustment is not needed when entities are not compared to each other.
The entities included in the testing were characterized by practice site and a minimum sample size of 20 completed surveys. The developer states that approximately 300 completed surveys per practice are needed for statistically reliable results but does not give a rationale for the minimum sample size of 20.
The percentage of sites meeting the expected threshold of 0.6 for split-half reliability was unclear from the measure submission.
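The sample-size and threshold questions raised above can be explored by inverting the Spearman-Brown formula. The sketch below is illustrative only: the ICC of 0.02 is a hypothetical value and is not taken from the submission, and the function names are invented for this example. It shows how one might report the number of surveys needed to reach a reliability of 0.6 and the share of entities meeting that threshold.

```python
from math import ceil


def spearman_brown(icc, n):
    """Projected reliability of an entity mean based on n responses."""
    return n * icc / (1 + (n - 1) * icc)


def surveys_needed(icc, target):
    """Smallest number of completed surveys that reaches the target reliability."""
    return ceil(target * (1 - icc) / (icc * (1 - target)))


def share_meeting_threshold(sample_sizes, icc, threshold=0.6):
    """Fraction of entities whose projected reliability is at or above the threshold."""
    met = sum(spearman_brown(icc, n) >= threshold for n in sample_sizes)
    return met / len(sample_sizes)


HYPOTHETICAL_ICC = 0.02  # assumption for illustration, not a reported result
needed = surveys_needed(HYPOTHETICAL_ICC, 0.6)
share = share_meeting_threshold([20, 100, 300], HYPOTHETICAL_ICC)
```

Under this assumed ICC, an entity with only 20 completed surveys would fall well short of 0.6, which is why a stated rationale for the 20-survey minimum matters.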
Rationale
- This maintenance measure is rated as ‘Not Met But Addressable’ for reliability because the developer performed the required reliability testing for this measure, but it is unclear whether the results demonstrate sufficient reliability at the accountable entity-level. However, the identified limitations are deemed addressable, as the developer may consider using case-mix adjusted data for reliability testing and providing their rationale for a minimum sample size of 20 completed surveys per practice site. By addressing these issues, there is potential to demonstrate sufficient reliability at the accountable entity-level.
Strengths
- The developer performed the required validity testing for this maintenance measure, namely, they conducted person (“data element”) validity testing for all critical data elements and accountable entity-level (“measure score”) validity testing at the health plan level. The data source used for validity analysis was the AHRQ CAHPS Health Plan Survey Database, with surveys administered to parents or guardians of Medicaid beneficiaries aged 0-17 from July 2023 through June 2024. Data included 111,833 respondents from 234 health plans in 49 states, the District of Columbia, and Puerto Rico.
The developer conducted empirical validity testing at the accountable entity level using Spearman’s rank-order correlation on unadjusted top box scores. The developer hypothesized that the Rating of Specialist would be positively and significantly correlated at weak to moderate magnitude with the four composites, having the strongest correlation with Getting Needed Care, reasoning that both measures assess experience with specialists. The developer posited that measures of patient experience should be correlated, but not so highly correlated as to suggest the measures are not distinct, citing rho > 0.80 as an example of a very large correlation. Rating of Specialist was significantly, positively correlated at the health plan level with all four composites (rhos ranged from 0.14 to 0.27), with slightly stronger correlation with Getting Needed Care (rho = 0.27), in line with the hypothesis.
The developer conducted statistical case-mix adjustment, selecting case-mix indicators that are present at the start of care and have a significant correlation with the outcome.
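The plan-level correlation analysis described above relies on Spearman's rank-order correlation, which is the Pearson correlation of the ranked values. A minimal, dependency-free sketch follows; the function names are hypothetical, and significance testing is omitted.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

In practice, `x` would hold each plan's top-box score for Rating of Specialist and `y` the same plans' scores on a composite such as Getting Needed Care.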
Limitations
- With respect to entity-level validity testing, while the developer has indicated that quality improvement activities can impact more than one measure, their hypotheses regarding the mechanisms involved and the degree to which these mechanisms are shared between IDMs are not clearly articulated. In the absence of an external gold standard against which to validate at least one of the IDMs, this submission would be strengthened by additional support in the logic model and evidence review guiding development of hypotheses about expected magnitudes of each correlation.
The developer used unadjusted scores for validity testing when use of adjusted scores might help rule out known sources of confounding.
The developer states that case-mix adjustment of the measure is optional for the user but does not provide guidance or supporting rationale stating when adjustment is or is not appropriate. The developer did not provide evidence demonstrating variation in the prevalence of case-mix factors across accountable entities. The statistical testing results provided by the developer do not reflect the impact of adjustment on providers at the high or low extremes of the case mix.
Rationale
- This maintenance measure is rated as ‘Not Met But Addressable’ for validity because the accountable-entity validity testing results partially support an inference of validity for the measure, suggesting that the measure somewhat accurately reflects performance on patient experience of care and can distinguish good from poor performance to a limited extent. This submission would be strengthened by explicitly ruling in mechanisms and ruling out confounders for the effect of health plan quality on survey respondents' Ratings of Specialist.
The developer employed a statistical case-mix adjustment approach, utilizing a conceptual model designed to account for demographic case-mix factors. Variation in the prevalence of case-mix indicators across different entities was not shown and the model testing results provided do not reflect whether case-mix differences are being appropriately accounted for.
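To illustrate why variation in case mix across entities matters, the sketch below applies a simple regression-based adjustment with a single covariate. It is a deliberately simplified stand-in and not the developer's model: the covariate, data layout, and function name are hypothetical, and a real adjustment would typically involve several case-mix indicators.

```python
def case_mix_adjust(data):
    """Adjust plan means for one covariate.

    data maps plan -> list of (covariate, outcome) pairs, one per respondent.
    Returns plan -> adjusted mean outcome, standardized to the overall
    covariate mean using the pooled within-plan slope.
    """
    all_x = [x for rows in data.values() for x, _ in rows]
    overall_x = sum(all_x) / len(all_x)
    plan_means = {}
    sxy = sxx = 0.0
    for plan, rows in data.items():
        mx = sum(x for x, _ in rows) / len(rows)
        my = sum(y for _, y in rows) / len(rows)
        plan_means[plan] = (mx, my)
        sxy += sum((x - mx) * (y - my) for x, y in rows)
        sxx += sum((x - mx) ** 2 for x, _ in rows)
    beta = sxy / sxx  # pooled within-plan slope of outcome on covariate
    return {
        plan: my - beta * (mx - overall_x)
        for plan, (mx, my) in plan_means.items()
    }


# Hypothetical data: plan B serves respondents with higher covariate values.
example = {
    "A": [(1, 3), (2, 5), (3, 7)],
    "B": [(3, 6), (4, 8), (5, 10)],
}
adjusted = case_mix_adjust(example)
```

In this made-up example the unadjusted means favor plan B (8 versus 5), while the adjusted means favor plan A (7 versus 6). If the covariate were distributed identically across plans, adjusted and unadjusted scores would coincide, which is why evidence on case-mix variation across entities is informative.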
Use and Usability
Strengths
- The measure is currently used in the Office of Personnel Management Federal Employees Health Benefits (FEHB) Health Plan Performance Assessment project, NCQA Health Insurance Plan Ranking and Health Plan Accreditation, CMS Medicare Advantage (MA) and Prescription Drug Plan (PDP) Program, Patient Protection and Affordable Care Act – CMS Exchange and Insurance Market Standards/Quality Rating System, Agency for Healthcare Research and Quality CAHPS Database, and CMS Core Measure Reporting.
The developer provides a summary of how accountable entities can use the measure results to improve performance, drawn from the CAHPS Ambulatory Care Improvement Guide. Specifically, health plans can invest in hiring staff who are service-minded and provide training on enrollee services so they can provide accurate information. Health plans should also listen to and act on enrollee complaints.
The developer seeks input from users including accreditors, health plans, and the public. The developer reports they have made minor changes to the wording of the instrument to reflect feedback from stakeholders, such as adding language to include care delivered by video or phone in addition to in-person care.
The developer reported no unexpected findings.
Limitations
- The developer reported changes in performance from 70% in 2014 to 74% in 2020. However, they note performance declined to 71% in 2023 and increased to 72% in 2024. The developer does not provide an explanation for this decrease. They assert this highlights the need for strengthening relationships between patients/parents/guardians and specialists.
The developer summarized how accountable entities can use the measure results to improve performance, but the submission could be enhanced by including these in the measure's logic model.
Rationale
- This maintenance measure is rated ‘Not Met But Addressable’. The measure shows variability in performance from 2014-2024. However, the application could be strengthened by offering an explanation of mean-level decreases in performance and the subsequent rebound. The developer reported no unexpected findings.
Committee Independent Review
Endorse
Importance
As with most of these, this is an important measure in understanding at a directional level the satisfaction with specialists.
Closing Care Gaps
I don't think this is addressed.
Feasibility Assessment
It is feasible, but for patients with multiple specialists how is that distinction made to the patient or family? If they see both an APP and physician, or only one or the other, how would that be made clear in the assessment?
Scientific Acceptability
Agree with staff assessment
Agree with staff assessment
Use and Usability
As with a lot of these measures, I'm not sure they get to the level of evaluation to really drive specific tactics and change.
Summary
It is an important measure and brings value, but could benefit from some optimization.
Endorse
Importance
Endorse
Closing Care Gaps
Endorse
Feasibility Assessment
Endorse
Scientific Acceptability
Endorse
Endorse
Use and Usability
Endorse
Summary
No concerns
Specialist Measure
Importance
Rating of Specialist reflects a critical aspect of children’s care, as specialists often manage complex or ongoing conditions requiring trust, coordination, and clear communication with parents or guardians. The presence of a 2024 performance gap (top-box range 62.1%–82.6%) demonstrates meaningful variation across health plans and indicates room for improvement. However, the current submission does not sufficiently connect this measure to downstream clinical outcomes for children, nor does the logic model clearly explain how improvements in specialist ratings translate into improved health or care coordination. In addition, patient input specific to parents’ or guardians’ experiences with pediatric specialists is limited. These gaps are addressable through a more targeted logic model and stronger, child-specific evidence of meaningfulness.
Closing Care Gaps
The developer did not address the optional Closing Care Gaps domain. While Rating of Specialist has potential to highlight disparities in access to and quality of specialty care for children, the submission does not describe how results are used to identify or close gaps across populations, geographies, or subgroups. Explicit articulation of how this measure could inform targeted interventions would strengthen its contribution to care gap reduction.
Feasibility Assessment
This measure demonstrates strong feasibility. Data are collected through an established, standardized CAHPS survey administered outside of clinical encounters, minimizing burden on providers and families. Multiple administration modes, including mail and electronic options, support equitable participation. The developer clearly addresses confidentiality protections, data validation, auditing, and respondent burden, and there are no licensing or usage fees. These features support sustainable and consistent implementation across health plans.
Scientific Acceptability
The developer conducted required entity-level reliability testing using accepted statistical methods. However, testing relied on unadjusted scores and included entities with a minimum sample size of 20 completed surveys, without sufficient justification. In addition, it is unclear how many sites met the expected reliability threshold. These limitations do not negate the value of the measure but should be addressed through clearer rationale for sample size thresholds and use of adjusted scores where appropriate to strengthen confidence in reliability.
Validity testing shows that Rating of Specialist is moderately correlated with related access and experience measures, consistent with stated hypotheses. However, the submission does not clearly articulate the mechanisms linking specialist ratings to broader care quality or outcomes for children, nor does it adequately address potential confounding through case-mix adjustment. Additional clarity in the logic model, stronger justification of hypotheses, and clearer guidance on when adjustment is appropriate would strengthen the validity argument.
Use and Usability
This measure is widely used across federal and accreditation programs and provides actionable insight into families’ experiences with specialty care. However, the submission does not explain recent performance declines or clearly link results to sustained improvement strategies within the logic model. Greater transparency regarding performance trends and clearer integration of improvement actions would enhance usability and trust in the measure.
Summary
I support CAHPS and believe the Rating of Specialist measure captures an important dimension of pediatric care. While the measure has a strong foundation and broad use, its importance and scientific acceptability would benefit from clearer child-specific logic modeling, stronger parent/guardian-centered evidence, and more explicit articulation of how results drive improvement. With these enhancements, the measure has clear potential to fully meet endorsement criteria.
0006-15
Importance
No Comments
Closing Care Gaps
No Comments
Feasibility Assessment
No Comments
Scientific Acceptability
No Comment
No Comment
Use and Usability
No Comment
Summary
No Comment
High Importance, Case-Mix Adjustment Approach Acceptable
Importance
Disagree with staff assessment. The wide distribution of scores overall suggests that patients have identified clear issues with the health care system reflected in their responses here, and several studies are provided supporting the link between patient experience with their health care providers, including specialists and clinical quality outcomes.
Closing Care Gaps
N/A
Feasibility Assessment
Agree with staff assessment. Measure is clearly feasible and widely used.
Scientific Acceptability
Disagree with staff assessment. I do not anticipate issues with reliability of the measure score if it were re-run with case mix adjusted figures, as the developer provided evidence that case mix adjustment has only a modest effect on the measure score, and in any case is an optional “add-on” to the existing measure score calculation. A mitigation strategy is presented for low reliability entities.
Disagree with staff assessment. I do not find that the issues presented with the case mix adjustment specific to validity are sufficient to affect the rating of the criterion, as these are optional adjustments and largely under the discretion of the individual adjuster.
Use and Usability
Disagree with staff assessment. Although there is little explanation provided for changes in performance, the changes in performance are slight and could be consistent with a positive upward drift.
Summary
Although the submission could be strengthened in some areas, the specific weaknesses are not sufficient to threaten the continued endorsement of this measure. This measure is a rare source of patient-reported data about the health care system, and reflects the performance of entities that are increasingly critical in guiding the course of health care in the United States, as health plans assume ever greater levels of control over provisioning care for their members.
Not met
Importance
There ought to be evidence that the measure not only rates the health plans, but that the rating influences the plan's activities. Many will see that proposal as unrealistic; however, if that is the case, why are we measuring? The developer ought to be drawing a line between the measurement and plan outcomes. If they cannot do that, the measure is unimportant.
Further, as indicated by the staff assessment, the developers cite old studies, present no new data, and do not provide any real evidence of importance. The developers need to do the work to establish importance. The measures have been around long enough to have a history of importance--if such importance exists--and be able to show a relationship between measurement and outcomes--if such a relationship exists.
Closing Care Gaps
This measure ought to be closing care gaps. The fact that the developer is not addressing this criterion reinforces the posit that it is not important, either.
Feasibility Assessment
The measure's feasibility ought to reflect not only its data collection possibilities, but also whether it has a benefit relative to its cost--including its personnel time and respondents' time. It provides no such data. For as long as it has been in use, such data ought to be available. If the measure is truly feasible, the developer ought to present the data with pride and vigor. That is not the case.
Scientific Acceptability
Consistent with the staff assessment, I agree that the developer submitted reliability data; however, examination of the measure's reliability ought to address all instances in which users employ the measure. As reported, this measure lacks such reliability measurement.
I am in complete concurrence with the staff assessment; however, given the period over which users have employed the measure the developer ought to have evidence on its usability that goes beyond outlining where organizations use it. If a measure is usable, it stands to reason that people have no difficulty in its use and that there is evidence to support that lack of difficulty. The developers present none of that.
Use and Usability
I am in complete concurrence with the staff assessment; however, given the period over which users have employed the measure the developer ought to have evidence on its usability that goes beyond outlining where organizations use it. If a measure is usable, it stands to reason that people have no difficulty in its use and that there is evidence to support that lack of difficulty. The developers present none of that.
Summary
Considering how long this measure has been in use, the developers ought to be able to produce an adequate supply of current measure data to support importance, feasibility, acceptability, and use and usability. They do not. This measure is not acceptable in its current form.
Support
Importance
No additional comments.
Closing Care Gaps
Optional item not submitted by measure developer.
Feasibility Assessment
No additional comments.
Scientific Acceptability
Measure developer can address the items noted in staff preliminary assessment.
Measure developer can address the items noted in staff preliminary assessment.
Use and Usability
No additional comments.
Summary
No additional comments.
summary
Importance
.
Closing Care Gaps
.
Feasibility Assessment
.
Scientific Acceptability
.
.
Use and Usability
.
Summary
Disagree with a rating scale of 1-10. All the various criteria that an individual would have to consider in distinguishing between the number ranges would make this survey ineffective. Allow clear options for individuals to choose and only limit to 3 choices- below average, average, above average. Definitions for below/avg/above should also be clearly delineated to reduce ambiguity and subjective nature of the question.
Public Comments
0006 Consumer Assessment of Healthcare Providers and Systems (CA
The American Medical Association appreciates the opportunity to comment on this survey and its associated composite measures. We are very concerned to see that testing at the measure score level was not provided for any of the composite measures. It is our understanding that measures that are undergoing endorsement maintenance should provide this level of testing. We believe that it is important for health plans, physicians, patients and caregivers, and others to know how the measures perform, particularly since they are used in many accountability programs and this information has been provided in past submissions. In addition, several of the composite measures for the adult and child versions produced Cronbach’s alpha internal consistency reliability results below 0.7, which is less than desirable.
Given the lack of testing at the measure score level and lower Cronbach’s alpha results for some of the composite measures, we do not believe that the minimum endorsement criteria have been met and endorsement should be reconsidered until these concerns are addressed.
Response to AMA
We are not exactly sure what is meant by “testing at the measure score level was not provided for any of the composite measures.” We provide performance scores by decile at the health plan level under the “Importance” tab, then select Performance Gap. Regarding the internal consistency reliability falling below 0.70 for several measures, this only happened for the measures including just two (2) items. Reliability is influenced by both item correlation and the number of items; a measure with a greater number of items would likely have higher reliability but would also increase respondent burden. In addition, internal consistency reliability is only one of the criteria for determining scientific acceptability, and the measure meets other required criteria, so it is important to examine the totality of the evidence rather than a single metric in isolation.
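The point that internal consistency depends on both item correlation and the number of items can be shown with Cronbach's alpha. The sketch below is illustrative only: the function names are hypothetical, and the mean inter-item correlation of 0.5 is an assumed value, not a result from the submission.

```python
from statistics import pvariance


def cronbach_alpha(items):
    """Cronbach's alpha; items is a list of per-item response lists aligned by respondent."""
    k = len(items)
    totals = [sum(responses) for responses in zip(*items)]  # per-respondent sum
    item_variance = sum(pvariance(item) for item in items)
    return k / (k - 1) * (1 - item_variance / pvariance(totals))


def standardized_alpha(k, mean_r):
    """Alpha implied by k items with mean inter-item correlation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)


ASSUMED_MEAN_R = 0.5  # hypothetical mean inter-item correlation
two_item = standardized_alpha(2, ASSUMED_MEAN_R)   # about 0.67, below 0.70
four_item = standardized_alpha(4, ASSUMED_MEAN_R)  # 0.80
```

With the same assumed inter-item correlation, a two-item composite falls below 0.70 while a four-item composite exceeds it, at the cost of additional respondent burden.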