Sleep Med Res > Volume 15(4); 2024 > Article
Jeong and Kim: Less Is More: Machine Learning-Based Shortened Sleep Questionnaires for Efficient Clinical Practice
Traditional sleep questionnaires have long been regarded as gold standards for assessing various aspects of sleep disorders [1,2]. Despite their effectiveness, these tools are often criticized for being too lengthy and time-consuming, which can lead to reduced patient compliance and data quality. This is particularly problematic in busy clinical settings and large-scale research studies. Given these challenges, a growing body of research has explored the potential of machine learning to develop shortened versions of these questionnaires without compromising their psychometric properties. I believe that these innovative approaches mark a significant advancement in sleep disorder assessment and could pave the way for more efficient and scalable clinical and research practices.
Recent studies have demonstrated that machine learning algorithms, such as eXtreme Gradient Boosting (XGBoost) and Random Forest, can be leveraged to identify a subset of items from traditional questionnaires that provide high predictive power for total scores and clinical classifications [3-8]. For example, Jo et al. [3] developed a data-driven shortened version of the Dysfunctional Beliefs and Attitudes about Sleep (DBAS)-16, called the DBAS-6, using exploratory factor analysis (EFA) and XGBoost. The DBAS-6, which consists of just six items, achieved an R² value of 0.90 for predicting the DBAS-16 total score, making it a highly efficient tool for clinical settings. Similarly, Lee et al. [5] applied the random forest algorithm to create two shortened versions of the Metacognitions Questionnaire-Insomnia (MCQ-I): the MCQI-6 and the MCQI-14. The six-item version (MCQI-6) showed a high area under the receiver operating characteristic curve (AUROC > 0.97), demonstrating its capacity to distinguish individuals with clinically significant insomnia from those without.
One of the main advantages of using machine learning for questionnaire reduction is the preservation of psychometric properties. Traditional methods for shortening questionnaires often rely on classical test theory or principal component analysis, which may not fully capture complex interactions among items. Machine learning algorithms, on the other hand, allow for the selection of items based on their importance in predicting the total score or classification outcome, thereby ensuring that the shortened questionnaire retains its predictive validity. For example, Jo et al. [4] utilized both EFA and XGBoost to develop the Insomnia Severity Index (ISI)-3m, a three-item version of the ISI. This shortened version outperformed several previously developed shortened versions of the ISI, achieving an R² value of 0.91 and an accuracy of 0.965 for classifying insomnia severity levels.
However, despite these clear benefits, machine learning-based shortened questionnaires often face practical challenges in clinical settings. Integrating machine learning methods into existing medical systems can be complex and resource-intensive. Additionally, the application of these methods typically requires extensive training or the involvement of specialized professionals, both of which are costly and time-consuming. Moreover, these machine learning-based questionnaires are often perceived as ‘black boxes,’ making them difficult for medical professionals to understand and trust. To tackle these problems and align with the principles of explainable artificial intelligence, new methodologies have been developed. For instance, Xie et al. [9] introduced AutoScore, an automatic clinical score generator that combines machine learning with regression modeling. AutoScore uses a random forest algorithm to select key questions from the original questionnaire and then groups responses to form logistic models that predict risk scores. This simple conversion of model coefficients to response weights results in a user-friendly, shortened questionnaire. However, AutoScore’s manual response grouping introduces subjectivity, and it lacks monotonicity constraints, limiting its clinical interpretability. To address these issues, Cawiding et al. [10] developed SymScore. SymScore automates response grouping, enforces monotonicity, and enhances flexibility, offering a more robust and interpretable solution.
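To make the step from model coefficients to response weights concrete, the sketch below turns hypothetical logistic regression coefficients into integer points and flags items whose points do not rise with response severity. The coefficient values, the item names, and the scaling rule (divide by the smallest nonzero coefficient, then round) are assumptions for illustration. They are not taken from AutoScore or SymScore, and the sketch only detects monotonicity violations; how SymScore enforces the constraint is not described here.

```python
def coefficients_to_points(coefs):
    """Scale coefficients so the smallest nonzero one is worth 1 point, then round."""
    nonzero = [abs(c) for levels in coefs.values() for c in levels if c != 0]
    unit = min(nonzero)
    return {item: [round(c / unit) for c in levels] for item, levels in coefs.items()}


def is_monotone(points):
    """True if points never decrease as the response level increases."""
    return all(a <= b for a, b in zip(points, points[1:]))


def risk_score(points_table, answers):
    """Add up the points for one respondent's chosen response levels."""
    return sum(points_table[item][level] for item, level in answers.items())


# Hypothetical log-odds coefficients per response level (level 0 is the reference).
coefs = {
    "difficulty_falling_asleep": [0.0, 0.4, 0.9, 1.7],
    "daytime_impairment": [0.0, 0.9, 0.4, 1.2],   # deliberately non-monotone
    "worry_about_sleep": [0.0, 0.8, 1.3],
}

if __name__ == "__main__":
    table = coefficients_to_points(coefs)
    for item, points in table.items():
        note = "ok" if is_monotone(points) else "not monotone"
        print(item, points, note)
    answers = {"difficulty_falling_asleep": 3, "daytime_impairment": 1, "worry_about_sleep": 2}
    print("score:", risk_score(table, answers))
```

An item flagged as not monotone would give a patient fewer points for a more severe answer, which is the kind of result a clinician would find hard to interpret.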
In conclusion, the application of machine learning to shorten sleep questionnaires is a promising development that could revolutionize the field of sleep medicine. By reducing the burden of lengthy assessments on both patients and clinicians, these shortened tools can improve compliance and data quality, ultimately leading to better diagnosis and treatment of sleep disorders. However, to realize their full potential, further research and innovation are needed to address the current limitations of integrating these models into clinical practice. With continued advancements in machine learning, we can expect to see more efficient and effective tools for assessing a wide range of psychological and medical conditions.

NOTES

Author Contributions
Conceptualization: Jae Kyoung Kim. Data curation: Eui Min Jeong. Formal analysis: all authors. Funding acquisition: Jae Kyoung Kim. Investigation: all authors. Methodology: all authors. Project administration: Jae Kyoung Kim. Resources: Jae Kyoung Kim. Software: Eui Min Jeong. Supervision: Jae Kyoung Kim. Validation: Jae Kyoung Kim. Visualization: Eui Min Jeong. Writing—original draft: all authors. Writing—review & editing: all authors.
Conflicts of Interest
The authors have no potential conflicts of interest to disclose.
Funding Statement
This research was supported by the Institute for Basic Science (IBS-R029-C3) and the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT (NRF-2022M3J6A1063021).

ACKNOWLEDGEMENTS

None

REFERENCES

1. Morin CM, Belleville G, Belanger L, Ivers H. The Insomnia Severity Index: psychometric indicators to detect insomnia cases and evaluate treatment response. Sleep 2011;34:601-8.
2. Espie CA, Kyle SD, Hames P, Gardani M, Fleming L, Cape J. The Sleep Condition Indicator: a clinical screening tool to evaluate insomnia disorder. BMJ Open 2014;4:e004183.
3. Jo H, Jeon HJ, Ahn J, Jeon S, Kim JK, Chung S. Dysfunctional Beliefs and Attitudes about Sleep-6 (DBAS-6): data-driven shortened version from a machine learning approach. Sleep Med 2024;119:312-8.
4. Jo H, Lim M, Jeon HJ, Ahn J, Jeon S, Kim JK, et al. Data-driven shortened Insomnia Severity Index (ISI): a machine learning approach. Sleep Breath 2024;28:1819-30.
5. Lee J, Ha S, Ahmed O, Cho IK, Lee D, Kim K, et al. Validation of the Korean version of the Metacognitions Questionnaire-Insomnia (MCQI) scale and development of shortened versions using the random forest approach. Sleep Med 2022;98:53-61.
6. Ha S, Choi SJ, Lee S, Wijaya RH, Kim JH, Joo EY, et al. Predicting the risk of sleep disorders using a machine learning-based simple questionnaire: development and validation study. J Med Internet Res 2023;25:e46520.
7. Duda M, Ma R, Haber N, Wall DP. Use of machine learning for behavioral distinction of autism and ADHD. Transl Psychiatry 2016;6:e732.
8. Nambo R, Karashima S, Mizoguchi R, Konishi S, Hashimoto A, Aono D, et al. Prediction and causal inference of cardiovascular and cerebrovascular diseases based on lifestyle questionnaires. Sci Rep 2024;14:10492.
9. Xie F, Chakraborty B, Ong MEH, Goldstein BA, Liu N. AutoScore: a machine learning-based automatic clinical score generator and its application to mortality prediction using electronic health records. JMIR Med Inform 2020;8:e21798.
10. Cawiding OR, Lee S, Jo H, Kim S, Suh S, Joo EY, et al. SymScore: machine learning accuracy meets transparency in a symbolic regression-based clinical score generator. Comput Biol Med 2025;185:109589.