№1, 2025

APPLICATION OF THE WORD2VEC ALGORITHM FOR CLINICAL DIAGNOSIS DETERMINATION
Uzeyir Gurbanli

The article examines the use of Natural Language Processing (NLP) technologies in modern medicine for knowledge acquisition and the implementation of decision-making methods. Applying information technology in healthcare has become one of the main requirements of the modern era. Improving the quality and accessibility of medical services requires modern technologies, mathematical methods, and the capabilities of artificial intelligence, alongside the development of comprehensive information systems. The paper proposes methods for analyzing and applying artificial intelligence tools that assist physicians in diagnosing conditions and determining treatment plans. It also addresses the evaluation of the quality of diagnostic and treatment processes through the practical application of different methods. Analyzing large volumes of medical data with NLP enables the extraction of valuable insights. A significant portion of medical data is stored and exchanged as text compliant with the Health Level Seven (HL7) standard, which makes semantic similarity methods that operate on textual data highly effective in this domain. By designing and implementing rules for applying and integrating different algorithms, medical data can be transformed into valuable knowledge that contributes significantly to advances in the medical field. The article presents a Word2Vec-based approach for detecting diagnoses of cardiovascular diseases from collected patient histories and for refining existing diagnoses. An algorithm capable of assigning new diagnoses based on historical patient records is one of the key outcomes of this research (pp. 26-33).

Keywords: Natural Language Processing, Diagnosis prediction, International Classification of Diseases, Hospital Information System, Artificial intelligence, Health Level Seven, Electronic Healthcare Records
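
The idea of matching a patient history to a diagnosis by semantic similarity over word vectors can be illustrated with a minimal sketch. It is not the article's algorithm: the tiny `EMBEDDINGS` table, its three-dimensional values, and the helper names `mean_vector`, `cosine`, and `nearest_diagnosis` are all hypothetical, standing in for vectors that a Word2Vec model trained on clinical text would supply.

```python
import math

# Hypothetical 3-dimensional embeddings standing in for vectors from a trained
# Word2Vec model; the values are made up for illustration only.
EMBEDDINGS = {
    "chest": [0.9, 0.1, 0.0],
    "pain": [0.8, 0.2, 0.1],
    "angina": [0.85, 0.15, 0.05],
    "blood": [0.1, 0.9, 0.2],
    "pressure": [0.2, 0.8, 0.3],
    "hypertension": [0.05, 0.95, 0.1],
}


def mean_vector(tokens):
    """Average the vectors of known tokens; return None if none are known."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_diagnosis(history_text, diagnoses):
    """Return the diagnosis label whose mean vector is closest to the history."""
    history_vec = mean_vector(history_text.lower().split())
    if history_vec is None:
        return None
    scored = []
    for label in diagnoses:
        label_vec = mean_vector(label.lower().split())
        if label_vec is not None:
            scored.append((cosine(history_vec, label_vec), label))
    # Highest cosine similarity wins; None if no label could be embedded.
    return max(scored)[1] if scored else None


candidates = ["angina", "hypertension"]
print(nearest_diagnosis("patient reports chest pain", candidates))
print(nearest_diagnosis("elevated blood pressure", candidates))
```

Averaging word vectors is only one simple way to embed a text. Tokens missing from the vocabulary are skipped here, and a real system would also need clinical tokenization and a mapping from labels to International Classification of Diseases codes.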