
Standardizing fatigue measurement in multiple sclerosis: the validity, responsiveness and score interpretation of the PROMIS SF v1.0 – Fatigue (MS) 8a

Open Access. Published: June 28, 2021. DOI: https://doi.org/10.1016/j.msard.2021.103117

      Highlights

      • Current questionnaires used to measure fatigue in patients with MS exhibit varied robustness in psychometric properties.
      • We evaluated the PROMIS Fatigue (MS) 8a as a measure of fatigue severity in MS.
      • The instrument demonstrated strong validity, test-retest reliability and responsiveness.
      • The psychometric properties of the PROMIS Fatigue (MS) 8a as reported in the current study are generalizable to patients with MS not requiring a scooter or wheelchair for mobility.
      • Our data extend evidence supporting this fatigue measure across MS populations.

      Abstract

      Background

      Fatigue is one of the most common symptoms of multiple sclerosis (MS) and the single most disabling. However, there is no consensus on which rigorously validated, generalizable, and publicly available instruments are most appropriate for measuring fatigue in clinical practice and research. The objective of this research was to generate additional evidence regarding the validity and applicability of the PROMIS SF v1.0 – Fatigue (MS) 8a, including content validity, reliability, construct validity and responsiveness, as well as to establish minimal important difference (MID) estimates and a score interpretation tool to aid meaningful individual-level score interpretation.

      Methods

      A mixed-methods, sequential design was followed. Cognitive debriefing (CD) interviews (n=29) were performed with MS patients, to assess the relevance and comprehensiveness of the PROMIS Fatigue (MS) 8a scores. To evaluate the psychometric properties of the PROMIS Fatigue (MS) 8a, two observational studies were conducted: a cross-sectional study at two US MS centers (n=296), and a 96-week longitudinal study in a UK MS Register cohort (n=384). Main outcomes and measures were estimates of known-groups validity, convergence validity, reliability, and responsiveness, a guide for interpreting PROMIS Fatigue (MS) 8a T-scores, and anchor-based MID estimates.

      Results

      The CD interviews confirmed the comprehensiveness and relevance of the PROMIS Fatigue (MS) 8a in assessing MS fatigue. Cronbach's alpha (>0.9) and intra-class correlation coefficient (≥0.9) for test-retest scores at 5–7 days follow-up, supported strong internal consistency and test-retest reliability. Hypothesized differences were found across patient groups in patient reported fatigue and related concepts (analysis of variance [ANOVA], P <0.001). PROMIS Fatigue (MS) 8a scores were sensitive to bi-directional changes in fatigue (GHS fatigue global question) and physical health (PROMIS GHS GPH), over a 52-week follow-up. Score changes of 3.4–4 points are proposed as MID criteria for minimal improvement or worsening in fatigue.

      Conclusion

      This research extends the evidence supporting the content validity and the robust psychometric performance of the PROMIS Fatigue (MS) 8a across US and UK MS populations. Importantly, data supporting the measure's integration in clinical practice and research, including meaningful score interpretation, are now available.

      Keywords

      Abbreviations:

      CD (cognitive debriefing), ES (effect size), FSS (fatigue severity scale), GHS (global health score), GPH (global physical health), MFIS (modified fatigue impact scale), MID (minimal important difference), MSIS (multiple sclerosis impact score), MSWS (multiple sclerosis walking scale)

      1. Introduction

      Multiple sclerosis (MS) is a chronic autoimmune disease of the central nervous system (CNS) that results in inflammation, demyelination and neurodegeneration (Filippi M, Bar-Or A, Piehl F, et al. Multiple sclerosis).
      Fatigue is one of the most prevalent and burdensome symptoms of MS, reported by >80% of patients, from within a year of disease onset, and throughout the disease course (Kister I, Bacon TE, Chamot E, et al. Natural history of multiple sclerosis symptoms).
      Higher fatigue levels have been observed in progressive disease compared with relapsing-remitting forms (Mills RJ, Young CA. The relationship between fatigue and other clinical features of multiple sclerosis).
      For relapsing-remitting forms, worsening fatigue was one of the main symptoms associated with relapses (Nickerson M, Cofield SS, Tyry T, Salter AR, Cutter GR, Marrie RA. Impact of multiple sclerosis relapse: The NARCOMS participant perspective).
      Given its subjective nature, fatigue is assessed based on self-report, with numerous validated questionnaires used in MS clinical trials and clinical practice, e.g. Modified Fatigue Impact Scale (MFIS) and the Fatigue Severity Scale (FSS) (Elbers RG, Rietberg MB, van Wegen EE, et al. Self-report fatigue questionnaires in multiple sclerosis, Parkinson's disease and stroke: a systematic review of measurement properties; Beckerman H, Eijssen IC, van Meeteren J, Verhulsdonck MC, de Groot V. Fatigue profiles in patients with multiple sclerosis are based on severity of fatigue and not on dimensions of fatigue; Nordin A, Taft C, Lundgren-Nilsson A, Dencker A. Minimal important differences for fatigue patient reported outcome measures-a systematic review).
      These measures, however, exhibit varied robustness in fulfilling requirements for content validity and measurement properties for patient-reported outcome (PRO) measures used in drug development as a basis for endpoints and labelling claims (Elbers RG, et al.; Beckerman H, et al.; Nordin A, et al.; US Food and Drug Administration. Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims; US Food and Drug Administration. Methods to Identify What is Important to Patients & Select; Penner IK, Paul F. Fatigue as a symptom or comorbidity of neurological diseases).
      The PROMIS Fatigue (MS) 8a, based on modern test methods and derived from the PROMIS item bank, is among the most promising instruments recently developed. Specifically, this measure assesses fatigue severity, conceptualized as comprising the experience (i.e. intensity and frequency) and impact of fatigue. Its intended context includes adults with progressive or relapsing forms of MS, with all levels of MS disability.
      During the PROMIS Fatigue (MS) 8a's development, patients with MS (n = 44) and clinicians (n = 36) were directly involved in selecting the items most relevant to patients with MS from the PROMIS fatigue item bank. Subsequently, robust psychometric performance (i.e. internal consistency, known-groups validity, and convergence validity) was demonstrated for the measure in a cross-sectional study (n = 231). However, other psychometric properties such as responsiveness were not evaluated in the initial validation work. This research sought to generate additional evidence regarding the validity and applicability of the PROMIS Fatigue (MS) 8a, specifically:
      • to evaluate the relevance, comprehensiveness, and comprehensibility of PROMIS Fatigue (MS) 8a using cognitive debriefing interviews;
      • to assess construct validity, test–retest reliability, and responsiveness of PROMIS Fatigue (MS) 8a scores across UK and US MS populations;
      • to establish minimal important difference (MID) estimates to aid interpretation of meaningful within-person score change.

      2. Methods

      2.1 Study design

      This research followed a mixed-methods sequential design. Cognitive debriefing interviews were performed with MS patients to assess the relevance, comprehensiveness, and comprehensibility of the PROMIS Fatigue (MS) 8a. Interviews were conducted by a trained interviewer and based on a structured topic guide (eMETHODS Online Supplement; Content Validity). Subsequently, two observational studies were performed to evaluate the psychometric properties of the PROMIS Fatigue (MS) 8a: a cross-sectional study at two MS centers in the US [US-UW study], and a longitudinal study among members of the UK MS Register [UK-MSR study]. Ethical approvals were obtained for all studies [RTI IRB #14206, South West Central Bristol National Research Ethics Service 16/SW/0194, Bristol (UK); Western IRB #20182214, Seattle (WA)]. All study participants gave informed consent prior to participation.

      2.2 Participants and procedures

      Study participants in the observational studies [UK-MSR; US-UW] had a clinician-confirmed MS diagnosis, were 18–65 years old, able to use a computer or tablet, and able to read and write in English. Exclusion criteria included experiencing cognitive or other impairment (e.g. visual) that would interfere with questionnaire completion, and using a wheelchair or scooter as the main form of mobility. In addition, patients with a patient-reported web-based EDSS (PR-WebEDSS) >6.5 and those with MS phenotypes other than relapsing remitting, primary progressive, and secondary progressive were excluded.
      The UK MS Register is among the largest MS registries in Europe (Ford DV, Jones KH, Middleton RM, et al. The feasibility of collecting information from people with Multiple Sclerosis for the UK MS Register via a web portal: characterising a cohort of people with MS). Data are collected via a web-portal, with the possibility of linkage with healthcare records provided by participating NHS neurology clinics (Ford DV, et al.).
      Members are recruited through the registry's portal or the 48 participating NHS neurology centers across the UK. All registry members were sent an email invitation to join the study. In addition to routine assessments in the registry (every 6 months), study participants completed various PROs (described below) over 96-weeks (i.e. baseline, weeks 1, 24, 52, 72, and 96). Data were collected from September 2018 to October 2020; the current analyses are based on baseline through to week 52 data.
      The US-UW study was conducted at two MS centers, the MS Center at University of Washington Medical Center – Northwest and the Swedish Neuroscience Institute, in Seattle, Washington (US). Patients were invited to join the study via post (if they had an active registration at one of the two centers) or during their routine clinic attendance. Prospective data collection took place from July 2019 to January 2020 in the clinic using an iPad tablet computer.

      2.3 Outcome measures

      Participants completed the following measures:
      • PROMIS Fatigue (MS) 8a (Cook KF, Bamer AM, Roddey TS, Kraft GH, Kim J, Amtmann D. A PROMIS fatigue short form for use by individuals who have multiple sclerosis);
      • PR-WebEDSS (Leddy S, Hadavi S, McCarren A, Giovannoni G, Dobson R. Validating a novel web-based method to capture disease progression outcomes in multiple sclerosis);
      • PROMIS v1.2 – Global Health (GHS) (Hays RD, Bjorner JB, Revicki DA, Spritzer KL, Cella D. Development of physical and mental health summary scores from the patient-reported outcomes measurement information system (PROMIS) global items);
      • Multiple Sclerosis Impact Scale (MSIS-29) (Hobart J, Lamping D, Fitzpatrick R, Riazi A, Thompson A. The Multiple Sclerosis Impact Scale (MSIS-29): a new patient-based outcome measure);
      • Functional Assessment of Multiple Sclerosis (FAMS) (Cella DF, Dineen K, Arnason B, et al. Validation of the functional assessment of multiple sclerosis quality of life instrument);
      • Multiple Sclerosis Walking Scale (MSWS)-12 (Hobart JC, Riazi A, Lamping DL, Fitzpatrick R, Thompson AJ. Measuring the impact of MS on walking ability: the 12-Item MS Walking Scale (MSWS-12));
      • Fatigue Severity Scale (FSS) (Mills R, Young C, Nicholas R, Pallant J, Tennant A. Rasch analysis of the Fatigue Severity Scale in multiple sclerosis);
      • Modified Fatigue Impact Scale (MFIS) (Learmonth YC, Dlugonski D, Pilutti LA, Sandroff BM, Klaren R, Motl RW. Psychometric properties of the Fatigue Severity Scale and the Modified Fatigue Impact Scale);
      • Patient Global Rating of Change (PGRC) – Fatigue (Jaeschke R, Singer J, Guyatt GH. Measurement of health status. Ascertaining the minimal clinically important difference);
      • EuroQoL-5D-3L (Gusi N, Olivares PR, Rajendram R. The EQ-5D Health-Related Quality of Life Questionnaire);
      • Hospital Anxiety and Depression Scale (Johnston M, Pollard B, Hennessey P. Construct validation of the hospital anxiety and depression scale with clinical populations);
      • Patient Health Questionnaire-8 (PHQ) (Kroenke K, Strine TW, Spitzer RL, Williams JB, Berry JT, Mokdad AH. The PHQ-8 as a measure of current depression in the general population).
      Further data on clinical characteristics such as EDSS, MS phenotype and treatment history were extracted retrospectively from patients' records.
      The PROMIS Fatigue (MS) 8a is scored on a T-score metric, with a mean of 50 and a standard deviation (SD) of 10; higher scores indicate higher fatigue. The T-score metric is referenced to the US general population, e.g. a T-score of 40 is one SD below the US general population mean. Scores are calculated based on item response theory (i.e., graded response model), using individuals' item responses. Scoring can be performed using the Assessment Center Application Programming Interface (AC-API), which calculates PROMIS T-scores for each participant based on response pattern scoring (https://www.healthmeasures.net/score-and-interpret/calculate-scores/scoring-instructions). Alternatively, T-scores for the PROMIS Fatigue (MS) 8a can be calculated from raw sum scores using a raw-sum-score to T-score crosswalk table; however, this approach may not be as accurate as response pattern scoring using the API (eMETHODS, Online Supplement).
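      The T-score metric described above can be illustrated with a short sketch. It assumes the conventional PROMIS linear rescaling T = 50 + 10θ, where θ is the IRT-estimated fatigue level in reference-population SD units; the function names are illustrative and are not part of the AC-API.

```python
def theta_to_t_score(theta: float) -> float:
    """Rescale a latent trait estimate (theta, in SD units) to the T-score metric.

    Assumes the conventional linear transformation T = 50 + 10 * theta.
    """
    return 50.0 + 10.0 * theta


def sd_from_reference_mean(t_score: float) -> float:
    """Distance of a T-score from the reference mean of 50, in SD units."""
    return (t_score - 50.0) / 10.0


# A T-score of 40 lies one SD below the reference mean.
print(theta_to_t_score(-1.0))           # 40.0
print(sd_from_reference_mean(40.0))     # -1.0
```

      Under this reading, positive values of `sd_from_reference_mean` indicate more fatigue than the US general population mean.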

      2.4 Statistical analysis

      The psychometric properties of the PROMIS Fatigue (MS) 8a were evaluated, including unidimensionality, reliability, validity, responsiveness, and MID estimates. A tool was developed to aid score interpretation (T-score map). Software used included Stata v15.1 (StataCorp LLC. Stata statistical software: Release 15; 2017), Mplus v8.2 (Muthén LK, Muthén BO. Mplus User's Guide), and R v3.33 (R Core Team. R: A Language and Environment for Statistical Computing; 2018).

      Analyses were performed separately for the UK and US studies.
      Exploratory factor analysis and a one-factor model confirmatory factor analysis were performed to assess unidimensionality of the PROMIS Fatigue (MS) 8a. The availability of two study samples allowed the more rigorous approach of exploring the factor structure with one sample and confirming in a second.
      The proportions of the sample with highest/lowest responses across all items were calculated to evaluate ceiling/floor effects. A proportion of >0.15 was judged to be a problematic ceiling or floor effect.
      • Terwee CB
      • Bot SD
      • de Boer MR
      • et al.
      Quality criteria were proposed for measurement properties of health status questionnaires.
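      The floor/ceiling check described above can be sketched as follows. This is a minimal illustration that assumes each respondent's answers are coded 1 to 5 across the eight items (the coding and the example data are hypothetical); only the 0.15 criterion comes from the text.

```python
from typing import Sequence, Tuple


def floor_ceiling_proportions(
    responses: Sequence[Sequence[int]], lowest: int = 1, highest: int = 5
) -> Tuple[float, float]:
    """Proportion of respondents choosing the lowest (floor) or the
    highest (ceiling) response option on every item."""
    n = len(responses)
    floor = sum(all(r == lowest for r in row) for row in responses) / n
    ceiling = sum(all(r == highest for r in row) for row in responses) / n
    return floor, ceiling


def is_problematic(proportion: float, threshold: float = 0.15) -> bool:
    """Apply the >0.15 criterion for a problematic floor or ceiling effect."""
    return proportion > threshold


# Hypothetical responses from four people to eight items.
sample = [
    [1] * 8,
    [5] * 8,
    [3, 4, 3, 2, 4, 3, 3, 4],
    [2, 2, 3, 2, 1, 2, 2, 3],
]
floor, ceiling = floor_ceiling_proportions(sample)
print(floor, ceiling)  # 0.25 0.25
```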
      Cronbach's alpha coefficients were calculated to assess the PROMIS Fatigue (MS) 8a's internal consistency. The intra-class correlation coefficient (ICC; mixed-effects model for absolute agreement, model 3 type 1) (Coons SJ, Gwaltney CJ, Hays RD, et al. Recommendations on evidence needed to support measurement equivalence between electronic and paper-based patient-reported outcome (PRO) measures: ISPOR ePRO Good Research Practices Task Force report) was calculated between baseline and follow-up scores (5–27 days) to assess test-retest reliability. Only patients with no change in GHS fatigue global responses were included in this analysis. An ICC ≥0.7 is considered adequate for aggregated/group analyses (Reeve BB, Wyrwich KW, Wu AW, et al. ISOQOL recommends minimum standards for patient-reported outcome measures used in patient-centered outcomes and comparative effectiveness research).
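      The two reliability statistics named above can be sketched in a few lines. The block below is an illustration, not the study's Stata code: Cronbach's alpha uses the standard variance formula, and the ICC is the two-way, single-measurement, absolute-agreement form, which is assumed here to match the model described in the text. All data are hypothetical.

```python
from statistics import mean, variance
from typing import Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; rows are respondents, columns are items."""
    k = len(responses[0])
    item_variance_sum = sum(variance(col) for col in zip(*responses))
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


def icc_absolute_agreement(scores: Sequence[Sequence[float]]) -> float:
    """Two-way, single-measurement, absolute-agreement ICC.

    Rows are respondents, columns are occasions (e.g. test and retest).
    """
    n, k = len(scores), len(scores[0])
    grand = mean([v for row in scores for v in row])
    ss_rows = k * sum((mean(row) - grand) ** 2 for row in scores)
    ss_cols = n * sum((mean(col) - grand) ** 2 for col in zip(*scores))
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical test and retest T-scores for three stable respondents.
identical = [(50, 50), (60, 60), (70, 70)]
shifted = [(50, 52), (60, 62), (70, 72)]
print(round(icc_absolute_agreement(identical), 3))  # 1.0
print(round(icc_absolute_agreement(shifted), 3))    # 0.98
```

      The second example shows why absolute agreement is the stricter choice for test-retest work: a uniform 2-point shift at retest lowers the coefficient even though the rank order of respondents is unchanged.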
      Spearman's rho was estimated between scores on the PROMIS Fatigue (MS) 8a and those of related PRO measures to assess convergence validity; a rho of >0.4 supports convergence validity (Fayers PM, Machin D. Quality of Life: The Assessment, Analysis and Interpretation of Patient-reported Outcomes; Prinsen CAC, Mokkink LB, Bouter LM, et al. COSMIN guideline for systematic reviews of patient-reported outcome measures).
      A strong correlation (rho >0.6) was expected with other fatigue measures; a moderate correlation (rho 0.4–0.6) was expected with scores on measures of more distal but related concepts (e.g. GHS GPH Summary Score, MSWS-12).
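      As an illustration of the convergence criteria above, the sketch below computes Spearman's rho as the Pearson correlation of average ranks and labels the result using the thresholds from the text. The helper names and example scores are hypothetical.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    return pearson(average_ranks(x), average_ranks(y))


def classify(rho: float) -> str:
    """Label |rho| with the thresholds stated in the text."""
    size = abs(rho)
    if size > 0.6:
        return "strong"
    if size >= 0.4:
        return "moderate"
    return "below convergence threshold"


# Hypothetical paired scores on two fatigue measures.
promis = [48.0, 55.2, 61.0, 66.3, 70.1]
fss = [20, 34, 30, 51, 58]
rho = spearman_rho(promis, fss)
print(round(rho, 2), classify(rho))  # 0.9 strong
```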
      To assess known-groups validity, the following hypotheses related to differences in PROMIS Fatigue (MS) 8a scores across distinct patient groups were examined based on ANOVA:
      • Participants with lower fatigue will report better PROMIS MS Fatigue scores than participants with worse fatigue.
            • GHS global08 none/mild/moderate versus severe/very severe
            • FSS score of <36 versus FSS ≥36
      • Participants with better health status will report better PROMIS MS Fatigue scores than participants with worse health.
            • GHS global01 excellent/very good/good versus fair/poor
            • GHS global03 physical health excellent/very good/good versus fair/poor
            • GHS GPH summary score of <50 versus ≥50
      • Participants with lower disability will report better PROMIS MS Fatigue scores than participants with higher disability.
            • EDSS of ≤4 versus 4.5–6.5
            • PR-WebEDSS of ≤4 versus 4.5–6.5
      • Participants with higher mobility (lower extremity function (LEF)) will report better PROMIS MS Fatigue scores than participants with worse LEF.
            • MSWS-12 of <25 versus 25 to <50 versus ≥50
            • FAMS mobility of ≤15 versus 16 to 22 versus >22
            • EQ-5D-3L mobility no problems versus some problems
      • Participants with progressive disease will report worse PROMIS MS Fatigue scores than those with relapsing disease.
            • RRMS versus SPMS versus PPMS
      • Participants with one or more relapses in the past 6 months will report worse PROMIS MS Fatigue scores than participants with no relapse.
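      Each hypothesis above amounts to a one-way ANOVA comparing mean T-scores across predefined groups. A minimal sketch of the F statistic follows, using hypothetical T-scores; obtaining a p-value would additionally require the F distribution from a statistics library.

```python
from statistics import mean
from typing import Sequence, Tuple


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, between-groups df, within-groups df) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Between-groups sum of squares: group size times squared deviation
    # of each group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: deviations from each group's own mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical T-scores for a lower-disability and a higher-disability group.
lower_disability = [50.0, 52.0, 54.0]
higher_disability = [60.0, 62.0, 64.0]
print(one_way_anova_f([lower_disability, higher_disability]))  # (37.5, 1, 4)
```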
      Score changes from baseline to weeks 24 and 52 were examined across patient groups experiencing different levels of change in fatigue or functional status (i.e. on various anchors) to assess responsiveness [UK-MSR only]. Patients were classified as improving/worsening based on score changes (∆) from baseline to week 24 and from baseline to week 52 on:
      • GHS fatigue question [≥1-point decrease; ≥1-point increase];
      • GHS GPH summary score [≥5-point decrease; ≥5-point increase];
      • FSS score [≥4.5-point increase; ≥4.5-point decrease].
      The appropriateness of the selected anchors was assessed based on multiple criteria (Coon CD, Cook KF. Moving from significance to real-world meaning: methods for interpreting change in clinical outcome assessment scores; Yost KJ, Eton DT, Garcia SF, Cella D. Minimally important differences were estimated for six patient-reported outcomes measurement information system-cancer scales in advanced-stage cancer patients):
      • measuring fatigue or a concept related to that measured by the PROMIS Fatigue (MS) 8a;
      • a Spearman's rho >0.3 with ∆ PROMIS Fatigue (MS) 8a;
      • at least 10 observations in each change group (i.e. worsening, minimal worsening, improving, minimal improvement).
      Within-group score changes were tested using paired t-tests. The standardized response mean and Cohen's d effect size (ES) were calculated for each group. ES is interpreted as: small, ES=0.2; moderate, ES=0.5; and large, ES=0.8 (Cohen J.).
      Between-group comparisons in score changes (i.e. worsening vs unchanged; unchanged vs improving) were performed using analysis of covariance (ANCOVA), controlling for baseline score.
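      The responsiveness statistics above can be sketched as follows. The text does not spell out the formulas, so this assumes the conventional definitions: ES is the mean change divided by the SD of baseline scores, and the standardized response mean (SRM) is the mean change divided by the SD of change scores. The "trivial" label and the example data are illustrative additions.

```python
from statistics import mean, stdev
from typing import Dict, Sequence


def responsiveness(
    baseline: Sequence[float], follow_up: Sequence[float]
) -> Dict[str, float]:
    """Mean change, effect size and SRM for one change group."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    mean_change = mean(changes)
    return {
        "mean_change": mean_change,
        "effect_size": mean_change / stdev(baseline),  # assumed definition
        "srm": mean_change / stdev(changes),           # assumed definition
    }


def interpret_effect_size(es: float) -> str:
    """Label |ES| using the 0.2 / 0.5 / 0.8 benchmarks from the text."""
    size = abs(es)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "moderate"
    if size >= 0.2:
        return "small"
    return "trivial"  # label not used in the text; added for completeness


# Hypothetical T-scores for four patients classified as worsening.
baseline_scores = [50.0, 55.0, 60.0, 65.0]
week52_scores = [54.0, 58.0, 66.0, 68.0]
result = responsiveness(baseline_scores, week52_scores)
print(round(result["effect_size"], 2), interpret_effect_size(result["effect_size"]))
# 0.62 moderate
```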
      Anchor-based approaches were applied to establish MID estimates for PROMIS Fatigue (MS) 8a score changes from baseline to week 24/week 52; these were supported by distribution-based metrics (eMETHODS). Anchors for minimal improvement and minimal worsening included baseline to week 24/week 52 score change on:
      • GHS fatigue question [1-point decrease; 1-point increase];
      • FSS score [4.5–9.9-point increase; 4.5–9.9-point decrease];
      • GHS GPH summary score [5-point decrease; 5-point increase].
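      One common way to turn anchors like these into MID estimates is to average the PROMIS score change among patients whose anchor moved by exactly the minimal amount. The sketch below illustrates that idea with hypothetical data; it is not the study's analysis code, and it deliberately reports results by signed anchor change rather than labelling a direction as improvement or worsening.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def mean_change_by_anchor(
    records: Iterable[Tuple[int, float]], minimal_change: int = 1
) -> Dict[int, float]:
    """Mean PROMIS T-score change for patients with a minimal anchor change.

    Each record is (anchor change, PROMIS T-score change). Patients whose
    anchor did not change, or changed by more than the minimal amount,
    are excluded.
    """
    groups: Dict[int, List[float]] = defaultdict(list)
    for anchor_delta, promis_delta in records:
        if abs(anchor_delta) == minimal_change:
            groups[anchor_delta].append(promis_delta)
    return {delta: mean(values) for delta, values in groups.items()}


# Hypothetical (anchor change, T-score change) pairs.
example = [
    (1, -3.0), (1, -4.0),
    (-1, 4.0), (-1, 3.0),
    (0, 0.5),   # unchanged on the anchor: excluded
    (2, -8.0),  # more than minimal change: excluded
]
print(mean_change_by_anchor(example))  # {1: -3.5, -1: 3.5}
```

      In practice each group would also need to satisfy the sample-size criterion stated earlier (at least 10 observations per change group) before its mean is used.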
      To facilitate score interpretation, a T-score map was created that categorized PROMIS Fatigue (MS) 8a T-scores. Results are portrayed as a heatmap showing the most likely response for each item, for each T-score (Supporting Figure 1).
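      The idea behind such a T-score map can be sketched with the graded response model: for a given T-score, compute the probability of each response category for an item and report the most likely one. The item parameters below are hypothetical placeholders, not the calibrated PROMIS parameters, and the T-to-θ conversion assumes T = 50 + 10θ.

```python
import math
from typing import List, Sequence


def category_probabilities(
    theta: float, discrimination: float, thresholds: Sequence[float]
) -> List[float]:
    """Graded response model category probabilities for one item.

    `thresholds` must be in ascending order; K - 1 thresholds define
    K ordered response categories.
    """
    # Cumulative probabilities P(response >= category), padded with 1 and 0.
    cumulative = (
        [1.0]
        + [1.0 / (1.0 + math.exp(-discrimination * (theta - b))) for b in thresholds]
        + [0.0]
    )
    return [cumulative[i] - cumulative[i + 1] for i in range(len(cumulative) - 1)]


def most_likely_response(
    t_score: float, discrimination: float, thresholds: Sequence[float]
) -> int:
    """1-based index of the most probable response category at a T-score."""
    theta = (t_score - 50.0) / 10.0
    probs = category_probabilities(theta, discrimination, thresholds)
    return probs.index(max(probs)) + 1


# Hypothetical parameters for a single 5-category item.
A = 3.0
B = [-0.5, 0.3, 1.0, 1.8]
for t in (30, 50, 80):
    print(t, most_likely_response(t, A, B))
# 30 1
# 50 2
# 80 5
```

      Repeating this for every item and every T-score yields the grid of most likely responses that a heatmap would display.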

      3. Results

      3.1 Content validity

      Cognitive debriefing interview participants (n=29) had a mean age of 44.2 (26–67) years, and 83% were female. Other key sociodemographic and clinical characteristics are presented in Table 1.
      Table 1. Key sociodemographics and clinical characteristics of study participants.

      | Characteristic | Cognitive debriefing interviews, baseline (n=29) | UK MS Register [a], baseline (n=384) | UK MS Register [a], week 52 follow-up [b] (n=311) | University of Washington/MS clinics [a], baseline (n=296) |
      |---|---|---|---|---|
      | Age: mean (SD) | 45.5 | 49.9 (9.8) | 50.7 (9.4) | 44.50 (11.2) |
      | Age: median | | 51 | 52 | 43.5 |
      | Age: range | 26–67 | 22–65 | 22–65 | 21.05–65.6 |
      | Gender, n (%): male | 5 (17.2) | 91 (23.7) | 75 (24.1) | 75 (25.3) |
      | Gender, n (%): female | 24 (82.8) | 293 (76.3) | 236 (75.9) | 219 (74.0) |
      | Gender, n (%): non-binary | | | | 2 (0.7) |
      | Ethnicity [c], n (%): White | 26 (89.7) | | | 235 (79.4) |
      | Ethnicity [c], n (%): African American/Black | 1 (3.4) | | | 14 (4.7) |
      | Ethnicity [c], n (%): Hispanic | 2 (6.9) | | | 18 (6.1) |
      | Ethnicity [c], n (%): Mixed | | | | 15 (5.1) |
      | Ethnicity [c], n (%): Asian | | | | 9 (3.0) |
      | Ethnicity [c], n (%): Other | | | | 5 (1.7) |
      | Time since MS diagnosis, years: mean (SD) | 5.2 | 10.22 (7.96) | 10.45 (8.30) | 9.65 (7.51) |
      | Time since MS diagnosis, years: median | n/a | 8.00 | 8 | 8.22 |
      | Time since MS diagnosis, years: range | 0.3–11 | 0–38 | 0–38 | 0.12–37.7 |
      | MS phenotype: relapse-remitting | 16 | 260 (67.7) | 204 (65.6) | 280 (94.6) |
      | MS phenotype: secondary progressive | n/a | 85 (22.1) | 74 (23.8) | 9 (3.0) |
      | MS phenotype: primary progressive | 13 | 39 (10.2) | 33 (10.6) | 7 (2.4) |
      | EDSS (n=258) [d]: mean (SD) | | | | 2.62 (1.7) |
      | EDSS (n=258) [d]: median | | | | 2 |
      | EDSS (n=258) [d]: min-max | | | | 0–6.5 |
      | EDSS (n=258) [d]: mild (0–4.0) | | | | 221 (85.7) |
      | EDSS (n=258) [d]: moderate (>4–6.5) | | | | 37 (14.3) |
      | PR-WebEDSS: mean (SD) | | 4.59 (1.89) | 4.67 (2.0) | 3.41 (1.7) |
      | PR-WebEDSS: median | | 5 | 5.5 | 3.5 |
      | PR-WebEDSS: min-max | | 0–6.5 | 0–6.5 | 0–6.5 |
      | PR-WebEDSS: mild (0–4.0) | | 168 (43.75) | 129 (41.5) | 202 (68.2) |
      | PR-WebEDSS: moderate (>4–6.5) | | 216 (56.25) | 182 (58.5) | 94 (31.8) |
      | Global MS severity: mild | 8 (27.6) | | | |
      | Global MS severity: moderate | 16 (55.2) | | | |
      | Global MS severity: severe | 5 (17.2) | | | |

      a Analysis sample includes respondents with EDSS ≤6.5, age ≤65 years, with PPMS, RRMS, SPMS phenotypes.
      b Baseline characteristics of participants with a week 52 follow-up assessment.
      c "Other" includes Don't know or not sure and American Indian/American Native for the US-UW study.
      d EDSS score was extracted in the US-UW only, based on most recent assessment within the last 2 years. Data were available for n=258.
      EDSS: Expanded Disability Status Scale; MS: multiple sclerosis; PPMS: primary progressive MS; PR-WebEDSS: patient-reported web-based EDSS; RRMS: relapsing remitting MS; SD: standard deviation; SPMS: secondary progressive MS.
      A summary of findings from the interviews is presented in Supporting Table 2.
      Table 2. One-factor CFA of the PROMIS Fatigue (MS) 8a – standardized loadings [a] and goodness of fit statistics: baseline.

      | Item | UK-MSR (n=384): estimate [b] | UK-MSR: SE | US-UW (n=296): estimate [b] | US-UW: SE |
      |---|---|---|---|---|
      | 1. How often were you too tired to think clearly? | 0.821*** | 0.017 | 0.828*** | 0.019 |
      | 2. How often were you too tired to enjoy life? | 0.874*** | 0.012 | 0.898*** | 0.014 |
      | 3. How often did you find yourself getting tired easily? | 0.91*** | 0.011 | 0.926*** | 0.009 |
      | 4. How often did you feel tired even when you hadn't done anything? | 0.884*** | 0.012 | 0.908*** | 0.01 |
      | 5. How often did you have trouble finishing things because of your fatigue? | 0.935*** | 0.007 | 0.946*** | 0.008 |
      | 6. How often did you have to push yourself to get things done because of your fatigue? | 0.876*** | 0.013 | 0.939*** | 0.009 |
      | 7. How often did your fatigue interfere with your social activities? | 0.907*** | 0.010 | 0.893*** | 0.019 |
      | 8. To what degree did your fatigue interfere with your physical functioning? | 0.898*** | 0.011 | 0.919*** | 0.01 |

      | Goodness of fit statistics [c] | UK-MSR (n=384) | US-UW (n=296) |
      |---|---|---|
      | RMSEA (95% CI) | 0.105 (0.086–0.125) | 0.13 (0.11–0.152) |
      | CFI | 0.995 | 0.995 |
      | TLI | 0.993 | 0.993 |
      | SRMR | 0.016 | 0.026 |

      a Loadings based on pattern matrix.
      b Estimator = WLSMV, ROTATION = OBLIMIN.
      c Optimal fit was determined based on: 1) root mean square error of approximation <0.08, 2) root mean square residual (RMSR) <0.05, 3) comparative fit index >0.95, 4) Tucker-Lewis index >0.95, 5) item parameters (loadings) and magnitude of the residuals.
      CFA: confirmatory factor analysis; CFI: comparative fit index; CI: confidence interval; PROMIS: Patient-Reported Outcomes Measurement Information System; RMSEA: root mean square error of approximation; SE: standard error; SRMR: standardized root mean square residual; TLI: Tucker-Lewis index; WLSMV: weighted least squares means and variance.
      *p<0.05, **p<0.01, ***p<0.001.
      The PROMIS Fatigue (MS) 8a was judged to be comprehensive in covering key aspects of MS fatigue experience for most participants (23 of 29). A few participants suggested addition of content related to “need to rest, recover, nap” (n=3), “impacts on mood/emotions or family life” (n=1), “time of day when fatigue is worst” (n=1). The item statements, the 5-point Likert response options, and the recall period “in the past 7 days” were easily understood and judged to be appropriate by the participants. No modifications were made to the PROMIS Fatigue (MS) 8a.

      3.2 Psychometric properties

      Of those enrolled, 384 [UK-MSR] and 296 [US-UW] patients were included in the analysis (Supporting Figure 2). The mean age of UK-MSR and US-UW participants was 49.9 (SD=9.8) and 44.5 (SD=11.2), respectively. The majority had relapsing-remitting MS type, i.e. 67.7% [UK-MSR] and 94.6% [US-UW].

      3.3 Unidimensionality

      A one-factor structure emerged as the optimal solution for the PROMIS Fatigue (MS) 8a in the exploratory factor analysis (Supporting Table 3). Further, a one-factor confirmatory factor analysis model showed good fit, with all goodness-of-fit statistics except the root mean square error of approximation within recommended ranges in both samples (Table 2). Standardized loadings of 0.78–0.9 [UK-MSR] and 0.82–0.95 [US-UW] were obtained. These findings support essential unidimensionality of the PROMIS Fatigue (MS) 8a, suggesting that a single total score is appropriate for characterizing and measuring fatigue in MS with this measure.
      Table 3. Known-groups validity of the PROMIS Fatigue (MS) 8a – score differences across clinically relevant subgroups, using t-test/ANOVA: baseline.

      | Subgroup | UK-MSR: n | UK-MSR: mean (SE) | UK-MSR: F-statistic (df); p-value | US-UW: n | US-UW: mean (SE) | US-UW: F-statistic (df); p-value |
      |---|---|---|---|---|---|---|
      | EDSS score | | | | 258 | | 14.90 (255); <0.001 |
      | 0–4.0 | | | | 221 | 55.84 (0.73) | |
      | 4.5–6.5 | | | | 37 | 62.88 (1.05) | |
      | PR-WebEDSS score | 384 | | 18.23 (383); <0.001 | 296 | | 60.97 (295); <0.001 |
      | 0–4.0 | 168 | 53.73 (0.72) | | 202 | 54.46 (0.74) | |
      | 4.5–6.5 | 216 | 62.9 (0.49) | | 94 | 63.78 (0.76) | |
      | GHS fatigue question (global08r) | 361 | | 173.73 (360); <0.001 | 296 | | 153.58 (295); <0.001 |
      | Severe/very severe (1,2) | 99 | 67.6 (0.5) | | 74 | 68.05 (0.73) | |
      | None/mild/moderate (3,4,5) | 262 | 55.63 (0.52) | | 222 | 53.88 (0.62) | |
      | FSS scores (range: 9–63) | 374 | | 299.56 (273); <0.001 | | | |
      | <36 | 89 | 47.69 (0.87) | | | | |
      | ≥36 | 285 | 62.44 (0.39) | | | | |
      | GHS general health question (global01) | 361 | | 98.27 (358); <0.001 | 296 | | 37.11 (295); <0.001 |
      | Fair/poor (1,2) | 145 | 64.14 (0.51) | | 59 | 64.45 (1.13) | |
      | Excellent/very good/good (3,4,5) | 216 | 55.4 (0.65) | | 237 | 55.67 (0.66) | |
      | GHS physical health question (global03) | 361 | | 148.06 (360); <0.001 | 294 | | 66.35 (293); <0.001 |
      | Fair/poor (1,2) | 191 | 63.68 (0.48) | | 96 | 63.85 (0.93) | |
      | Excellent/very good/good (3,4,5) | 170 | 53.55 (0.7) | | 198 | 54.23 (0.69) | |
      | GHS GPH summary score | 361 | | 191.82 (360); <0.001 | 294 | | 295.77 (293); <0.001 |
      | >50 | 60 | 46.3 (1.08) | | 99 | 46.93 (0.8) | |
      | ≤50 | 301 | 61.4 (0.42) | | 195 | 62.67 (0.51) | |
      | MSWS-12 total score (range: 0–100) | 283 | | 48.23 (282); <0.001 | | | |
      | <25 | 116 | 52.72 (0.87) | | | | |
      | 25 to <50 | 53 | 56.78 (1.1) | | | | |
      | ≥50 | 114 | 63.44 (0.68) | | | | |
      | FAMS mobility score | | | | 296 | | 107.51 (295); <0.001 |
      | ≤15 | | | | 89 | 66.04 (0.76) | |
      | 16 to 22 | | | | 77 | 59.51 (0.86) | |
      | >22 | | | | 129 | 50.16 (0.78) | |
      | EQ-5D-3L mobility domain score | 377 | | 80.17 (376); <0.001 | | | |
      | No problems (0) | 111 | 52.8 (0.92) | | | | |
      | Some problems (1) | 266 | 61.43 (0.49) | | | | |
      | MS phenotype [a] | 384 | | 10.66 (383); <0.001 | 296 | | 0.06 (295); 0.805 |
      | RRMS (1) | 260 | 57.41 (0.61) | | 280 | 57.38 (0.62) | |
      | PPMS (2) | 39 | 61.19 (1.16) | | 16 | 58.05 (3.06) | |
      | SPMS (3) | 85 | 62.36 (0.85) | | | | |
      | Relapse within the last 6 months | | | | 264 | | 18.04 (263); <0.001 |
      | No | | | | 205 | 55.47 (0.75) | |
      | Yes | | | | 59 | 61.94 (1.12) | |

      a For the US-UW study, PPMS and SPMS were combined.
      df: degrees of freedom; EDSS: Expanded Disability Status Scale; FAMS: Functional Assessment of Multiple Sclerosis; FSS: fatigue severity scale; GHS: Global Health Status; MS: multiple sclerosis; MSWS-12: 12-item Multiple Sclerosis Walking Scale; PPMS: primary progressive MS; PROMIS: Patient-Reported Outcomes Measurement Information System; PR-WebEDSS: patient-reported web-based EDSS; RRMS: relapsing remitting MS; SE: standard error; SPMS: secondary progressive MS.

      3.4 Score distribution and reliability

      The mean PROMIS Fatigue (MS) 8a T-score was 58.9 (SD=9.4) [UK-MSR] and 57.7 (SD=10.5) [US-UW]. The percentages of participants with the lowest and highest possible scores on all items (i.e. floor and ceiling) were 3.4% and 1.3% [UK-MSR] and 5.4% and 1.4% [US-UW], well below the critical value of 15%. In both the UK-MSR and US-UW samples, Cronbach's alpha was 0.96 and the 5- to 27-day retest ICC was ≥0.9 (Supporting Figure 3). These results indicate strong internal consistency and score reproducibility for the PROMIS Fatigue (MS) 8a scores.
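
      As an illustration of how floor/ceiling percentages and Cronbach's alpha of the kind reported above can be computed from item-level responses, a minimal Python sketch follows. It is not the authors' analysis code: the function names, the 1–5 response coding and the toy responses are assumptions made for the example, and no study data are used.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha; `responses` holds one list of item scores per respondent."""
    k = len(responses[0])  # number of items
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def floor_ceiling(responses, lowest=1, highest=5):
    """Percentage of respondents choosing the lowest (floor) or the highest
    (ceiling) response option on every item. The 1-5 coding is an assumption."""
    n = len(responses)
    floor = sum(all(x == lowest for x in row) for row in responses)
    ceiling = sum(all(x == highest for x in row) for row in responses)
    return 100 * floor / n, 100 * ceiling / n


# Invented responses for four respondents on three items (not study data).
toy = [[1, 1, 1], [2, 3, 2], [4, 4, 5], [5, 5, 5]]
alpha = cronbach_alpha(toy)
floor_pct, ceiling_pct = floor_ceiling(toy)
print(f"alpha={alpha:.2f}, floor={floor_pct:.1f}%, ceiling={ceiling_pct:.1f}%")
```

      In this sketch a respondent counts toward the floor or ceiling only when every item is at the extreme, mirroring the definition given in the text.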

      3.5 Construct validity

      Results for known-groups validity analyses are presented in Table 3. PROMIS Fatigue (MS) 8a scores differed significantly between subgroups, in the expected directions (ANCOVA Test, p<0.01 for all but the MS phenotype tests).
      Spearman's rho correlations between scores on PROMIS Fatigue (MS) 8a and scores on other fatigue measures (GHS fatigue question, FSS, MFIS-total, MFIS-physical, MFIS-cognitive, MFIS-psychosocial subscales) ranged from 0.78 through 0.82. The range was 0.4–0.65 between scores on PROMIS Fatigue (MS) 8a and scores on other related PROs (EQ-5D-3L mobility domain, functional assessment of multiple sclerosis mobility, MSWS-12, PR-WebEDSS score, GHS general health question, GHS physical health question, GHS GMH summary score, GHS GPH summary score, MSIS-29 physical health domain, MSIS-29 psychosocial impact domain) (Table 4).
      Table 4. Convergence validity of the PROMIS Fatigue (MS) 8a: Spearman's rank sum correlations ᵃ with related PRO measures: baseline and week 52.

      | Measure | UK MS Register, Baseline: n | UK MS Register, Baseline: rho | UK MS Register, Week 52: n | UK MS Register, Week 52: rho | US-UW, Baseline: n | US-UW, Baseline: rho |
      |---|---|---|---|---|---|---|
      | Fatigue Severity Scale | 374 | 0.78 | 265 | 0.80 | N/A | N/A |
      | Modified Fatigue Impact Scale - Total Score | N/A | N/A | N/A | N/A | 296 | 0.87 |
      | Modified Fatigue Impact Scale - Physical Subscore | N/A | N/A | N/A | N/A | 296 | 0.86 |
      | Modified Fatigue Impact Scale - Cognitive Subscore | N/A | N/A | N/A | N/A | 296 | 0.75 |
      | Modified Fatigue Impact Scale - Psychosocial Subscore | N/A | N/A | N/A | N/A | 296 | 0.84 |
      | GHS Fatigue question (global08r) | 361 | -0.82 | 287 | -0.81 | 296 | -0.83 |
      | GHS GPH Summary Score | 361 | -0.78 | 287 | -0.71 | 294 | -0.81 |
      | GHS GMH Summary Score | N/A | N/A | N/A | N/A | 292 | -0.67 |
      | MSIS-29 Physical Impact Score | 376 | 0.71 | 266 | 0.61 | 296 | 0.75 |
      | MSIS-29 Psychological Impact Score | N/A | N/A | N/A | N/A | 296 | 0.70 |
      | EDSS score ᵇ | N/A | N/A | N/A | N/A | 258 | 0.38 |
      | PR-WebEDSS score | 361 | 0.53 | 217 | 0.46 | 296 | 0.57 |
      | GHS health question (global01) | 359 | -0.57 | 287 | -0.55 | 296 | -0.57 |
      | GHS physical health question (global03) | 361 | -0.65 | 287 | -0.56 | 294 | -0.57 |
      | Multiple Sclerosis Walking Scale (MSWS)-12 | 283 | 0.57 | 167 | 0.60 | N/A | N/A |
      | FAMS-Mobility | N/A | N/A | N/A | N/A | 295 | -0.69 |
      | EQ-5D-3L mobility domain | 377 | 0.40 | 272 | 0.45 | N/A | N/A |
      | PHQ-8 total score | N/A | N/A | N/A | N/A | 296 | 0.71 |

      Notes:
      a Correlation coefficients for MFIS, PHQ-8, MSIS-29, FAMS mobility, PROMIS GHS Summary Scores for the US-UW study were based on Pearson's correlation.
      b EDSS was retrospectively collected based on assessment within the previous 2 years (for US-UW only).
      N/A, not examined in the given study.

      3.6 Responsiveness

      Responsiveness was analyzed in the UK-MSR sample only (Table 5). Expected changes in PROMIS Fatigue (MS) 8a scores were observed in both the worsening group [∆ T-score = 3.97; Cohen's ES=0.34] and improving group [∆ T-score = -3.83; Cohen's ES=0.34], at week 52 (criterion for change was ≥1 point). A similar pattern of score changes was observed for other anchors (i.e. FSS and GHS GPH summary scores), and for baseline to week 24 changes.
      Table 5. Responsiveness of the PROMIS Fatigue (MS) – analysis of score change across response group: baseline to Week 24, and Week 52.
      PROMIS Fatigue (MS) 8a T-Scores

      | Anchor / statistic | ∆ (Baseline – Week 24): Improving | ∆ (Baseline – Week 24): Unchanged | ∆ (Baseline – Week 24): Worsening | ∆ (Baseline – Week 52): Improving | ∆ (Baseline – Week 52): Unchanged | ∆ (Baseline – Week 52): Worsening |
      |---|---|---|---|---|---|---|
      | Fatigue PGRC ᵃ | | | | | | |
      | n | 26 | 105 | 186 | 26 | 93 | 188 |
      | Mean change (SD) | 3.42 (5.35) | 0.55 (5.31) | -1.09 (4.17) | 2.61 (5.31) | 1.35 (4.74) | -0.76 (5.25) |
      | T-test statistic; p-value | 3.26; 0.003 | 1.06; 0.292 | -3.57; <0.001 | 2.50; 0.02 | 2.75; 0.01 | -2.00; 0.05 |
      | Effect size (est, 95% CI) | 0.41 (-0.14 to 0.96) | 0.06 (-0.22 to 0.33) | -0.15 (-0.35 to 0.06) | 0.28 (-0.27 to 0.82) | 0.14 (-0.15 to 0.43) | -0.11 (-0.32 to 0.09) |
      | SRM | 0.64 | 0.1 | -0.26 | 0.49 | 0.29 | -0.15 |
      | ANCOVA, statistic; p-value: stable vs worsened | 18.04; <0.001 | | | 52.36; <0.001 | | |
      | ANCOVA, statistic; p-value: stable vs improved | 5.56; 0.001 | | | 8.59; <0.001 | | |
      | GHS fatigue question (global08r) ᵇ | | | | | | |
      | n | 45 | 195 | 61 | 52 | 170 | 65 |
      | Mean change (SD) | 4.52 (5.41) | -0.14 (3.98) | -3.99 (4.22) | 3.97 (4.06) | 0.48 (4.52) | -3.83 (5.52) |
      | T-test statistic; p-value | 5.60; <0.001 | -0.49; 0.627 | -7.38; <0.001 | 7.06; <0.001 | 1.38; 0.17 | -5.60; <0.001 |
      | Effect size (est, 95% CI) | 0.61 (0.18 to 1.03) | -0.01 (-0.21 to 0.18) | -0.42 (-0.78 to -0.06) | 0.50 (0.11 to 0.89) | 0.05 (-0.16 to 0.27) | -0.44 (-0.78 to -0.09) |
      | SRM | 0.83 | -0.03 | -0.95 | 0.98 | 0.11 | -0.69 |
      | ANCOVA, statistic; p-value: stable vs worsened | 24.22; <0.001 | | | 46.38; <0.001 | | |
      | ANCOVA, statistic; p-value: stable vs improved | 21.71; <0.001 | | | 23.08; <0.001 | | |
      | GHS GPH Summary Score ᶜ | | | | | | |
      | n | 37 | 218 | 46 | 36 | 197 | 54 |
      | Mean change (SD) | 3.27 (4.99) | -0.02 (4.49) | -4.01 (4.53) | 3.64 (4.51) | 0.37 (4.80) | -3.06 (5.86) |
      | T-test statistic; p-value | 3.99; <0.001 | -0.05; 0.957 | -6.01; <0.001 | 4.84; <0.001 | 1.09; 0.28 | -3.83; <0.001 |
      | Effect size (est, 95% CI) | 0.34 (-0.12 to 0.79) | 0.00 (-0.19 to 0.19) | -0.38 (-0.79 to 0.04) | 0.38 (-0.08 to 0.85) | 0.04 (-0.16 to 0.24) | -0.35 (-0.73 to 0.03) |
      | SRM | 0.66 | 0 | -0.89 | 0.81 | 0.08 | -0.52 |
      | ANCOVA, statistic; p-value: stable vs worsened | 17.11; <0.001 | | | 34.81; <0.001 | | |
      | ANCOVA, statistic; p-value: stable vs improved | 9.52; <0.001 | | | 30.74; <0.001 | | |
      | FSS scores ᵈ | | | | | | |
      | n | 67 | 159 | 67 | 58 | 141 | 66 |
      | Mean change (SD) | 2.41 (5.76) | -0.33 (4.07) | -1.93 (4.14) | 2.86 (4.75) | -0.07 (4.72) | -1.73 (5.96) |
      | T-test statistic; p-value | -3.81; <0.001 | -1.01; 0.316 | 3.42; 0.001 | 4.59; <0.001 | -0.17; 0.87 | -2.35; 0.02 |
      | Effect size (est, 95% CI) | 0.30 (-0.04 to 0.64) | -0.03 (-0.25 to 0.19) | -0.23 (-0.57 to 0.11) | 0.34 (-0.03 to 0.70) | -0.01 (-0.24 to 0.23) | -0.22 (-0.57 to 0.12) |
      | SRM | 0.42 | -0.08 | -0.47 | 0.6 | -0.01 | -0.29 |
      | ANCOVA, statistic; p-value: stable vs worsened | 5.45; 0.005 | | | 20.31; <0.001 | | |
      | ANCOVA, statistic; p-value: stable vs improved | 11.70; <0.001 | | | 19.28; <0.001 | | |

      a Fatigue PGRC response groups: improving = A little better/moderately better/very much better, unchanged = no change, worsening = A little worse/moderately worse/very much worse.
      b GHS fatigue global question (global08r) response groups: improving = ≥1-point increase, unchanged = 0-point change, worsening = ≥1-point decrease.
      c GHS GPH summary score response groups: improving = ≥5-point increase, unchanged = decrease/increase <5 points, worsening = ≥5-point decrease.
      d FSS response groups: improving = ≥4.5-point increase, unchanged = <4.5-point decrease/increase, worsening = ≥4.5-point decrease.
      Cohen's d small = 0.2, medium = 0.5, large = 0.8; if d=1, there is one standard deviation difference between the groups.
      CI: confidence interval; FSS: Fatigue Severity Scale; GHS: Global Health Status; GPH: Global Physical Health; MS: multiple sclerosis; PGRC: patient global rating of change; PROMIS: Patient-Reported Outcomes Measurement Information System; SD: standard deviation; SRM, standard response mean.
      Between those worsening vs unchanged and those unchanged vs improving, PROMIS Fatigue (MS) 8a scores had expected and significant differences for all anchors (i.e. GHS fatigue question, FSS; and GHS GPH summary score) (ANOVA Test, p<0.01).
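
      The standardized response mean (SRM) values in Table 5 are consistent with the usual definition of the SRM as the mean change divided by the standard deviation of the change scores. The Python sketch below illustrates that calculation under this assumption; the function names are hypothetical, the summary values are taken from the GHS fatigue question rows of Table 5 (week 52), and the paired scores in the last lines are invented.

```python
from statistics import mean, stdev


def srm_from_summary(mean_change, sd_change):
    """SRM from reported summary statistics: mean change / SD of change."""
    return mean_change / sd_change


def srm_from_scores(baseline, follow_up):
    """SRM from paired T-scores, using change = baseline - follow-up
    (the sign convention of the Table 5 column headers)."""
    changes = [b - f for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


# Week 52 values for the GHS fatigue question anchor, as printed in Table 5.
improving = srm_from_summary(3.97, 4.06)
worsening = srm_from_summary(-3.83, 5.52)
print(round(improving, 2), round(worsening, 2))  # 0.98 -0.69

# Invented paired scores for four patients (not study data).
toy_srm = srm_from_scores([60, 62, 58, 65], [56, 60, 57, 59])
```

      The two printed values match the SRM entries reported for those groups, which supports reading the SRM column as mean change over SD of change.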

      3.7 Meaningful score interpretation

      Results for anchor-based MID estimates and distribution-based metrics for score interpretation are presented in Table 6. The baseline-week 52 mean ∆ PROMIS Fatigue (MS) 8a score was 3.86 (Cohen's ES=0.48) for patients with a 1-point increase (minimal improvement) in the GHS fatigue question, and -3.37 (ES=-0.39) for those with a 1-point decrease (minimal worsening). The mean ∆ scores on the other anchors were 3.06 to 3.46 for minimal improvement and -1.17 to -2.24 for minimal worsening. Considering estimates >SEM, with Cohen's d=0.2–0.8, a change of 3.4–4 points is proposed as the MID for improvement or worsening of PROMIS Fatigue (MS) 8a scores. Empirical cumulative distribution functions (eCDFs) for the three anchors, showing the cumulative proportion of patients experiencing a given mean ∆ score at week 52, support the proposed thresholds (Supporting Figure 4).
      Table 6. Interpretation of individual-level PROMIS Fatigue (MS) score change – MID estimates: baseline to Week 24, and baseline to Week 52.
      PROMIS Fatigue (MS) Score

      | Anchor / metric | ∆ (Week 24 – Baseline): Minimal worsening | ∆ (Week 24 – Baseline): Minimal improvement | ∆ (Week 52 – Baseline): Minimal worsening | ∆ (Week 52 – Baseline): Minimal improvement |
      |---|---|---|---|---|
      | Fatigue PGRC ᵃ | | | | |
      | n | 166 | 24 | 171 | 23 |
      | Mean change (SD) | -0.99 (4.17) | 3.25 (5.37) | -0.78 (5.39) | 2.16 (4.99) |
      | Median | -1.10 | 2.20 | -0.80 | 2.60 |
      | Effect size (est, 95% CI) | -0.14 (-0.35 to 0.08) | 0.44 (-0.14 to 1.01) | -0.12 (-0.33 to 0.09) | 0.22 (-0.36 to 0.8) |
      | GHS fatigue question (global08r) ᵇ | | | | |
      | n | 58 | 41 | 62 | 50 |
      | Mean change (SD) | -4.07 (4.25) | 3.58 (4.5) | -3.37 (4.62) | 3.86 (4.09) |
      | Median | -3.55 | 2.90 | -2.70 | 3.65 |
      | Effect size (est, 95% CI) | -0.43 (-0.79 to -0.06) | 0.49 (0.05 to 0.93) | -0.39 (-0.74 to -0.03) | 0.48 (0.08 to 0.88) |
      | GHS GPH Summary score ᶜ | | | | |
      | n | 48 | 35 | 51 | 41 |
      | Mean change (SD) | -3.9 (4.47) | 2.39 (4.18) | -2.24 (4.82) | 3.06 (4.31) |
      | Median | -3.70 | 2.70 | -2.20 | 3.30 |
      | Effect size (est, 95% CI) | -0.39 (-0.79 to 0.02) | 0.23 (-0.24 to 0.7) | -0.26 (-0.65 to 0.13) | 0.36 (-0.08 to 0.79) |
      | FSS scores ᵈ | | | | |
      | n | 45 | 44 | 39 | 35 |
      | Mean change (SD) | -1.49 (4.15) | 1.35 (4.77) | -1.17 (6.54) | 3.46 (4.53) |
      | Median | -1.30 | 0.40 | -1.70 | 4.40 |
      | Effect size (est, 95% CI) | -0.17 (-0.58 to 0.25) | 0.2 (-0.22 to 0.62) | -0.18 (-0.63 to 0.26) | 0.4 (-0.08 to 0.87) |

      | PROMIS Fatigue (MS) T-score | Baseline | Week 52 |
      |---|---|---|
      | 1/3 standard deviation | 3.14 | 2.93 |
      | 1/2 standard deviation | 4.71 | 4.39 |
      | IRT SEM | 2.07 | 2.06 |
      | "Traditional" SEM | 2.98 | 2.78 |

      a Fatigue PGRC response groups: minimal improvement = A little better/moderately better, minimal worsening = A little worse/moderately worse.
      b GHS fatigue global question (global08r) response groups: minimal improvement = 1-point increase, worsening = 1-point decrease.
      c GHS GPH summary score response groups: minimal improvement = 4.4–9.4-point increase, minimal worsening = 4.4–9.4 point decrease.
      d FSS response groups: minimal improvement = 4.5–9.9-point increase, minimal worsening = 4.5–9.9-point decrease.
      Cohen's d small = 0.2, medium = 0.5, large = 0.8; if d=1, there is one standard deviation difference between the groups.
      CI: confidence interval; FSS: Fatigue Severity Scale; GHS: Global Health Status; GPH: Global Physical Health; IRT, item response theory; MID: minimal important difference; MS: multiple sclerosis; PGRC, patient global rating of change; SD: standard deviation; SEM, standard error of measurement.
      The T-score map interpretative guide for PROMIS Fatigue (MS) 8a scores is presented in Supporting Figure 1. For each T-score, the map indicates the likely level of fatigue severity or impact on each individual item. For example, a patient with a T-score of 60 is most likely to report that they often get tired easily, are sometimes too tired to think clearly, and have some interference in their physical functioning.
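
      The distribution-based metrics in Table 6 can be approximated from the sample standard deviation and a reliability coefficient. The Python sketch below assumes that the "Traditional" SEM follows the conventional formula SD × √(1 − reliability) and that the reliability is 0.90; the text reports only ICC ≥0.9, so this value, like the function name, is an assumption for illustration.

```python
import math


def distribution_metrics(sd, reliability):
    """Distribution-based thresholds for interpreting score change."""
    return {
        "third_sd": sd / 3,                      # 1/3 standard deviation
        "half_sd": sd / 2,                       # 1/2 standard deviation
        "sem": sd * math.sqrt(1 - reliability),  # conventional SEM formula
    }


# UK-MSR baseline SD of 9.4 as reported in the text; reliability of 0.90 is assumed.
metrics = distribution_metrics(sd=9.4, reliability=0.90)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

      With these inputs the sketch gives values close to the baseline column of Table 6 (3.14, 4.71 and 2.98); small differences are expected because the SD and reliability used here are rounded.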

      4. Discussion

      The current research substantially extends the evidence base for the reliability, validity, and applicability of the PROMIS Fatigue (MS) 8a in MS populations. This was accomplished using a comprehensive, mixed methods research design that included both cross-sectional and longitudinal analyses. The concurrence of results from different samples collected in two countries lends confidence in the generalizability of the findings.
      The results of cognitive interviews largely supported the comprehensiveness of the PROMIS Fatigue (MS) 8a's content. However, three of 29 individuals suggested addition of items related to needing to rest or nap during the day to the short form. The complexity of the relationship between diurnal sleep and fatigue has been previously demonstrated [Mills RJ, Young CA. The relationship between fatigue and other clinical features of multiple sclerosis]. Although higher fatigue appears related to longer duration of sleep during the day, sleepiness (need to sleep) was only marginally related to fatigue (rho <0.3) [Mills RJ, Young CA. The relationship between fatigue and other clinical features of multiple sclerosis]. Sleeping during the day is one among various strategies patients use in coping with fatigue, but not all patients want or are able to use this strategy. Regardless of fatigue level, for many people with MS (e.g. people working outside home, parents with small children) sleeping during the day may not be an option. Items whose responses are impacted substantially by factors other than the construct of interest result in data “noise”, decreasing precision and reliability. For this reason, the research team considered it inappropriate to add items that queried respondents regarding their need to sleep during the day to the PROMIS Fatigue (MS) 8a.
      The mixed findings related to the goodness-of-fit of the one-factor CFA model should be interpreted with all other factor analysis results in mind, which broadly supported unidimensionality of the PROMIS Fatigue (MS) 8a score. Given the sensitivity of CFA goodness-of-fit statistics to factors other than data dimensionality, reliance on goodness-of-fit criteria for determining unidimensionality is not recommended [Cook KF, Kallen MA, Amtmann D. Having a fit: impact of number of items and distribution of data on traditional criteria for assessing IRT's unidimensionality assumption]. The factor analytic results supported the unidimensionality of the PROMIS Fatigue (MS) 8a scores; thus, current results confirm that a single summary score is adequate and efficient for summarizing patients’ fatigue.
      The PROMIS Fatigue (MS) 8a was sensitive to bidirectional changes in fatigue over 24- and 52-week follow-up durations; mild-to-moderate effect sizes were obtained. This result echoes previous studies that used PROMIS Fatigue measures in MS populations. For example, a small randomized controlled trial (n=35, study 1; n=27, study 2) evaluating the effectiveness of treatment with transcranial current stimulation on fatigue assessed the eight-item PROMIS Fatigue SF v1.0 – Fatigue 8a as the primary outcome [Charvet L, Serafin D, Krupp LB. Fatigue in multiple sclerosis]. Significantly greater reduction in PROMIS fatigue scores was reported in one study [-2.5±7.4 vs -0.2±5.3, p=0.30, Cohen's d=-0.35; -5.6±8.9 vs 0.9±1.9, p=0.02, Cohen's d=-0.71] [Cella D, Lai JS, Jensen SE, et al. PROMIS fatigue item bank had clinical validity across diverse chronic conditions; Bingham CO, Gutierrez AK, Butanis A, et al. PROMIS fatigue short forms are reliable and valid in adults with rheumatoid arthritis]. Published evidence on the reliability and validity of scores derived from PROMIS fatigue measures is considerable, consistent, and extends across numerous disease conditions [Cella D, Lai JS, Jensen SE, et al. PROMIS fatigue item bank had clinical validity across diverse chronic conditions; Bingham CO, Gutierrez AK, Butanis A, et al. PROMIS fatigue short forms are reliable and valid in adults with rheumatoid arthritis]. The results of the current study add to the accumulating body of evidence confirming robustness of PROMIS fatigue measures.
      Patterns of correlations between PROMIS Fatigue (MS) 8a scores and other PRO outcome measures were consistent with expectations, with the strongest correlations seen with other measures of fatigue. As fatigue contributes to functional limitations and overall disability burden, it was not surprising to also see at least moderate correlation with measures of physical health or disability. Thus, correlation data supported the convergence validity of PROMIS Fatigue (MS) 8a.
      We propose a score change of 3.4–4 points as the MID estimate for minimal improvement and minimal worsening in the PROMIS Fatigue (MS) 8a scores. This range meets key criteria for establishing meaningful change thresholds [US Food and Drug Administration. Methods to Identify What is Important to Patients & Select, Develop or Modify Fit-for-Purpose Clinical Outcomes Assessments; Coon CD, Cook KF. Moving from significance to real-world meaning: methods for interpreting change in clinical outcome assessment scores; Yost KJ, Eton DT, Garcia SF, Cella D. Minimally important differences were estimated for six patient-reported outcomes measurement information system-cancer scales in advanced-stage cancer patients] and compares well with MIDs of 3–5 points established for a 7-item PROMIS fatigue short form in patients with advanced cancer [Yost KJ, Eton DT, Garcia SF, Cella D. Minimally important differences were estimated for six patient-reported outcomes measurement information system-cancer scales in advanced-stage cancer patients].
      From a practical standpoint, the PROMIS Fatigue (MS) 8a is publicly available, has minimal respondent burden, and is scored on the same mathematical metric as PROMIS Fatigue scores derived by other short forms, the full item bank, and computer adaptive administration [Lai JS, Cella D, Choi S, et al. How item banks and their application can influence measurement practice in rehabilitation medicine: a PROMIS fatigue item bank example]. These practical aspects differentiate the PROMIS Fatigue (MS) 8a from other rigorously developed modern fatigue MS instruments, such as the FSIQ-RMS, and the NFI-MS [Hudgens S, Schuler R, Stokes J, Eremenco S, Hunsche E, Leist TP. Development and validation of the FSIQ-RMS: a new patient-reported questionnaire to assess symptoms and impacts of fatigue in relapsing multiple sclerosis; Mills RJ, Young CA, Pallant JF, Tennant A. Development of a patient reported outcome scale for fatigue in multiple sclerosis: The Neurological Fatigue Index (NFI-MS)].
      The current research has several strengths. The mixed-methods design allowed evaluation of content validity as well as psychometric properties of the PROMIS Fatigue (MS) 8a. The diversity of the study population is also a strength: participants came from two countries and included both registry participants and MS clinic patients, with both relapsing and progressive MS types.

      4.1 Limitations and future directions

      The cognitive interview study included a small number of patients with severe MS, while the observational studies did not include patients older than 65 years of age or with patient-reported EDSS greater than 6.5. Thus, the generalizability of the current study results is limited to patients with MS not requiring a scooter or wheelchair for mobility. These limitations should be viewed in context of the initial development study, which included patients with all levels of MS disability and fatigue [Cook KF, Bamer AM, Roddey TS, Kraft GH, Kim J, Amtmann D. A PROMIS fatigue short form for use by individuals who have multiple sclerosis]. Moreover, previous research [Amtmann D, Bamer AM, Kim J, Chung H, Salem R. People with multiple sclerosis report significantly worse symptoms and health related quality of life than the US general population as measured by PROMIS and NeuroQoL outcome measures] suggests that levels of fatigue among older adults or those with EDSS >7 “do not differ from those of other age groups, or those with slightly less disability” (EDSS 4.5–6.5), suggesting our results may be generalizable to patients of any age and disability level. Our investigation of responsiveness was reliant on identifying patients experiencing changes on anchor variables; overall, the mean change in fatigue for the sample was minimal. An assessment of responsiveness based on a treatment of known efficacy should be a priority for future work.

      5. Conclusion

      Enhancing the measurement of fatigue in clinical practice and research is an important step toward improved management of fatigue in MS. The solid psychometric properties of the PROMIS Fatigue (MS) 8a have been established in prior research and confirmed in the current study. The new findings on test-retest reliability, responsiveness, and score interpretation fill evidence gaps for the measure. The score interpretation guide may aid integration of PROMIS scores into clinical decision-making as well as facilitate clinician-patient communication. The MID estimates will be useful in evaluating fatigue changes over time in both routine clinical practice and clinical research settings.

      Funding

      This study was sponsored by Merck Healthcare KGaA, Darmstadt, Germany (CrossRef Funder ID: 10.13039/100009945). The sponsor was involved in the study design, data collection and analysis.

      Author Contributions

      Conceptualization and Methodology: Paul Kamudoni; Christian Henke; Karon F. Cook; Dagmar Amtmann. Data curation and Formal analysis: All authors. Roles/Writing - original draft: Paul Kamudoni; Christian Henke; Karon F. Cook; Dagmar Amtmann. Writing - review & editing: All authors. Statistical analysis: Sam Salek; Jeffrey Johns; Karon F. Cook; Rana Salem; Dagmar Amtmann. Project administration: Jana Raab; Rana Salem; Rod Middleton. All authors had access to the data.

      Data sharing

      Any requests for data by qualified scientific and medical researchers for legitimate research purposes will be subject to Merck's Data Sharing Policy. All requests should be submitted in writing to Merck's data sharing portal (https://www.merckgroup.com/en/research/our-approach-to-research-and-development/healthcare/clinical-trials/commitment-responsible-data-sharing.html). When Merck has a co-research, co-development, or co-marketing or co-promotion agreement, or when the product has been out-licensed, the responsibility for disclosure might be dependent on the agreement between parties. Under these circumstances, Merck will endeavor to gain agreement to share data in response to requests. UK MS Register data are accessible via the Register's Secure eResearch platform following appropriate training and formal governance review; the register does not release patient-level data.

      Previous presentation

      This work was presented in part at the ACTRIMS-ECTRIMS Annual Meeting 2020 and ISOQOL Annual Meeting 2020.

      Declaration of Competing Interest

      Paul Kamudoni, Christian Henke and Jana Raab are employees of Merck Healthcare KGaA, Darmstadt, Germany. Karon Cook has provided consultancy to Merck Healthcare KGaA, Darmstadt, Germany. Sam Salek has a consultancy contract with Merck Healthcare KGaA, Darmstadt, Germany; and unrestricted educational grants from GSK and the European Haematology Association. Pavle Repovic has acted as a consultant or speaker for Alexion, Biogen, Celgene, EMD Serono Research & Development Institute, Inc., Billerica, MA, USA, an affiliate of Merck KGaA, Medison, Novartis, Roche, Sanofi Genzyme, and Viela Bio. Annette Wundes has received research funding from Alkermes, Biogen, AbbVie and provided consultancy for AbbVie. Dagmar Amtmann has received research funding from EMD Serono Research & Development Institute, Inc., Billerica, MA, USA, an affiliate of Merck KGaA. Rana Salem has received research funding from EMD Serono Research & Development Institute, Inc., Billerica, MA, USA, an affiliate of Merck KGaA. Jeffrey Johns, Kevin N. Alschuler, Gloria von Geldern and Rod Middleton have nothing to disclose.

      Acknowledgements

      Contributions from non-authors: Amy Barrett and Bimpe Olayinka-Amao, employees of RTI Health Solutions, implemented the qualitative research aspects of the study. Medical editing support was provided by Bioscript Stirling Ltd, funded by Merck Healthcare KGaA, Darmstadt, Germany.

      Appendix. Supplementary materials

      References

        • Filippi M
        • Bar-Or A
        • Piehl F
        • et al.
        Multiple sclerosis.
        Nat Rev Dis Primers. 2018; 4: 43
        • Kister I
        • Bacon TE
        • Chamot E
        • et al.
        Natural history of multiple sclerosis symptoms.
        Int J MS Care. 2013; 15: 146-158
        • Mills RJ
        • Young CA.
        The relationship between fatigue and other clinical features of multiple sclerosis.
        Mult Scler. 2011; 17: 604-612
        • Nickerson M
        • Cofield SS
        • Tyry T
        • Salter AR
        • Cutter GR
        • Marrie RA.
        Impact of multiple sclerosis relapse: The NARCOMS participant perspective.
        Mult Scler Relat Disord. 2015; 4: 234-240
        • Elbers RG
        • Rietberg MB
        • van Wegen EE
        • et al.
        Self-report fatigue questionnaires in multiple sclerosis, Parkinson's disease and stroke: a systematic review of measurement properties.
        Qual Life Res. 2012; 21: 925-944
        • Beckerman H
        • Eijssen IC
        • van Meeteren J
        • Verhulsdonck MC
        • de Groot V.
        Fatigue profiles in patients with multiple sclerosis are based on severity of fatigue and not on dimensions of fatigue.
        Sci Rep. 2020; 10: 4167
        • Nordin A
        • Taft C
        • Lundgren-Nilsson A
        • Dencker A.
        Minimal important differences for fatigue patient reported outcome measures-a systematic review.
        BMC Med Res Methodol. 2016; 16: 62
        • US Food and Drug Administration
        Patient-Reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims.
        Guidance for Industry, 2009
        • US Food and Drug Administration
        Methods to Identify What is Important to Patients & Select, Develop or Modify Fit-for-Purpose Clinical Outcomes Assessments.
        Patient-Focused Drug Development Guidance Public Workshop, 2018
        • Penner IK
        • Paul F.
        Fatigue as a symptom or comorbidity of neurological diseases.
        Nat Rev Neurol. 2017; 13: 662-675
        • Ford DV
        • Jones KH
        • Middleton RM
        • et al.
        The feasibility of collecting information from people with Multiple Sclerosis for the UK MS Register via a web portal: characterising a cohort of people with MS.
        BMC Med Inform Decis Mak. 2012; 12: 73
        • Cook KF
        • Bamer AM
        • Roddey TS
        • Kraft GH
        • Kim J
        • Amtmann D
        A PROMIS fatigue short form for use by individuals who have multiple sclerosis.
        Qual Life Res. 2012; 21: 1021-1030
        • Leddy S
        • Hadavi S
        • McCarren A
        • Giovannoni G
        • Dobson R.
        Validating a novel web-based method to capture disease progression outcomes in multiple sclerosis.
        J Neurol. 2013; 260: 2505-2510
        • Hays RD
        • Bjorner JB
        • Revicki DA
        • Spritzer KL
        • Cella D
        Development of physical and mental health summary scores from the patient-reported outcomes measurement information system (PROMIS) global items.
        Qual Life Res. 2009; 18: 873-880
        • Hobart J
        • Lamping D
        • Fitzpatrick R
        • Riazi A
        • Thompson A.
        The Multiple Sclerosis Impact Scale (MSIS-29): a new patient-based outcome measure.
        Brain. 2001; 124: 962-973
        • Cella DF
        • Dineen K
        • Arnason B
        • et al.
        Validation of the functional assessment of multiple sclerosis quality of life instrument.
        Neurology. 1996; 47: 129-139
        • Hobart JC
        • Riazi A
        • Lamping DL
        • Fitzpatrick R
        • Thompson AJ.
        Measuring the impact of MS on walking ability: the 12-Item MS Walking Scale (MSWS-12).
        Neurology. 2003; 60: 31-36
        • Mills R
        • Young C
        • Nicholas R
        • Pallant J
        • Tennant A.
        Rasch analysis of the Fatigue Severity Scale in multiple sclerosis.
        Mult Scler. 2009; 15: 81-87
        • Learmonth YC
        • Dlugonski D
        • Pilutti LA
        • Sandroff BM
        • Klaren R
        • Motl RW.
        Psychometric properties of the Fatigue Severity Scale and the Modified Fatigue Impact Scale.
        J Neurol Sci. 2013; 331: 102-107
        • Jaeschke R
        • Singer J
        • Guyatt GH.
        Measurement of health status. Ascertaining the minimal clinically important difference.
        Control Clin Trials. 1989; 10: 407-415
        • Gusi N
        • Olivares PR
        • Rajendram R.
        The EQ-5D Health-Related Quality of Life Questionnaire.
        Handbook of Disease Burdens and Quality of Life Measures. 2010: 87-99
        • Johnston M
        • Pollard B
        • Hennessey P.
        Construct validation of the hospital anxiety and depression scale with clinical populations.
        J Psychosom Res. 2000; 48: 579-584
        • Kroenke K
        • Strine TW
        • Spitzer RL
        • Williams JB
        • Berry JT
        • Mokdad AH.
        The PHQ-8 as a measure of current depression in the general population.
        J Affect Disord. 2009; 114: 163-173
        • StataCorp LLC
        Stata Statistical Software: Release 15.
        2017
        • Muthén LK
        • Muthén BO.
        Mplus User's Guide.
        Eighth Edition. Los Angeles, CA, 2017
        • R Core Team
        R: A Language and Environment for Statistical Computing.
        2018
        • Terwee CB
        • Bot SD
        • de Boer MR
        • et al.
        Quality criteria were proposed for measurement properties of health status questionnaires.
        J Clin Epidemiol. 2007; 60: 34-42
        • Coons SJ
        • Gwaltney CJ
        • Hays RD
        • et al.
        Recommendations on evidence needed to support measurement equivalence between electronic and paper-based patient-reported outcome (PRO) measures: ISPOR ePRO Good Research Practices Task Force report.
        Value Health. 2009; 12: 419-429
        • Reeve BB
        • Wyrwich KW
        • Wu AW
        • et al.
        ISOQOL recommends minimum standards for patient-reported outcome measures used in patient-centered outcomes and comparative effectiveness research.
        Qual Life Res. 2013; 22: 1889-1905
        • Fayers PM
        • Machin D.
        Quality of Life: The Assessment, Analysis and Interpretation of Patient-reported Outcomes.
        Wiley, 2013
        • Prinsen CAC
        • Mokkink LB
        • Bouter LM
        • et al.
        COSMIN guideline for systematic reviews of patient-reported outcome measures.
        Qual Life Res. 2018; 27: 1147-1157
        • Coon CD
        • Cook KF.
        Moving from significance to real-world meaning: methods for interpreting change in clinical outcome assessment scores.
        Qual Life Res. 2018; 27: 33-40
        • Yost KJ
        • Eton DT
        • Garcia SF
        • Cella D.
        Minimally important differences were estimated for six patient-reported outcomes measurement information system-cancer scales in advanced-stage cancer patients.
        J Clin Epidemiol. 2011; 64: 507-516
        • Cohen J.
        Statistical Power Analysis for the Behavioral Sciences. Revised ed. 1987
        • Cook KF
        • Kallen MA
        • Amtmann D.
        Having a fit: impact of number of items and distribution of data on traditional criteria for assessing IRT's unidimensionality assumption.
        Qual Life Res. 2009; 18: 447-460
        • Charvet L
        • Serafin D
        • Krupp LB.
        Fatigue in multiple sclerosis.
        Fatigue: Biomedicine, Health & Behavior. 2014; 2: 3-13
        • Cella D
        • Lai JS
        • Jensen SE
        • et al.
        PROMIS fatigue item bank had clinical validity across diverse chronic conditions.
        J Clin Epidemiol. 2016; 73: 128-134
        • Bingham CO
        • Gutierrez AK
        • Butanis A
        • et al.
        PROMIS fatigue short forms are reliable and valid in adults with rheumatoid arthritis.
        J Patient Rep Outcomes. 2019; 3: 14
        • Lai JS
        • Cella D
        • Choi S
        • et al.
        How item banks and their application can influence measurement practice in rehabilitation medicine: a PROMIS fatigue item bank example.
        Arch Phys Med Rehabil. 2011; 92: S20-S27
        • Hudgens S
        • Schuler R
        • Stokes J
        • Eremenco S
        • Hunsche E
        • Leist TP.
        Development and validation of the FSIQ-RMS: a new patient-reported questionnaire to assess symptoms and impacts of fatigue in relapsing multiple sclerosis.
        Value Health. 2019; 22: 453-466
        • Mills RJ
        • Young CA
        • Pallant JF
        • Tennant A.
        Development of a patient reported outcome scale for fatigue in multiple sclerosis: The Neurological Fatigue Index (NFI-MS).
        Health Qual Life Outcomes. 2010; 8: 22
        • Amtmann D
        • Bamer AM
        • Kim J
        • Chung H
        • Salem R.
        People with multiple sclerosis report significantly worse symptoms and health related quality of life than the US general population as measured by PROMIS and NeuroQoL outcome measures.
        Disabil Health J. 2018; 11: 99-107