Emerging Research Trends in Natural Language Processing for Multilingual AI


Loso Judijanto
Arnes Yuli Vandika

Abstract

This study explores the emerging trends and developments in Natural Language Processing (NLP) for Multilingual Artificial Intelligence (AI) through a comprehensive bibliometric analysis. Drawing on data from the Scopus database spanning 2013 to 2023, the research identifies key publication patterns, influential contributors, thematic clusters, and collaboration networks that shape the evolution of multilingual NLP. The analysis reveals a significant increase in research activity over the past five years, driven largely by advances in deep learning and the emergence of multilingual pretrained models such as mBERT and XLM-RoBERTa. Institutions from the United States, India, and China lead the global research landscape, while collaborative clusters highlight the interdisciplinary and international nature of the field. Keyword analysis shows a paradigm shift from rule-based and statistical approaches to neural and transformer-based architectures, with growing application in healthcare, social media, and big data environments. Despite this growth, the study identifies persistent challenges, including disparities in language representation and bias in model training, and underscores the need for ethical and inclusive research practices. The findings provide a strategic overview for researchers, policymakers, and practitioners aiming to advance equitable and effective multilingual AI systems.
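The keyword co-occurrence and thematic-cluster analysis the abstract describes can be illustrated with a minimal Python sketch. This is not the authors' actual pipeline; the file name "scopus_export.csv" and the reliance on the semicolon-separated "Author Keywords" column of a standard Scopus CSV export are assumptions made for illustration.

```python
# Minimal sketch of the keyword co-occurrence step of a bibliometric
# analysis. Assumes (hypothetically) a Scopus CSV export named
# "scopus_export.csv" whose "Author Keywords" column holds
# semicolon-separated keywords, as in standard Scopus exports.
from collections import Counter
from itertools import combinations

import networkx as nx
import pandas as pd

df = pd.read_csv("scopus_export.csv")

# Count how often each pair of keywords appears in the same record.
pair_counts = Counter()
for cell in df["Author Keywords"].dropna():
    keywords = sorted({k.strip().lower() for k in cell.split(";") if k.strip()})
    pair_counts.update(combinations(keywords, 2))

# Build a weighted co-occurrence network; densely connected regions of
# this graph correspond to the thematic clusters discussed in the study.
graph = nx.Graph()
for (a, b), weight in pair_counts.items():
    graph.add_edge(a, b, weight=weight)
print(graph.number_of_nodes(), "keywords,", graph.number_of_edges(), "links")

# The most frequent pairs surface the dominant themes.
for (a, b), weight in pair_counts.most_common(10):
    print(f"{a} -- {b}: {weight}")
```

The abstract also credits multilingual pretrained models such as mBERT and XLM-RoBERTa with driving recent growth. The sketch below shows the shared-encoder idea behind these models using the Hugging Face transformers library; the checkpoint names are the publicly released ones, but the usage is a generic illustration, not a method from the paper.

```python
# Minimal sketch of a multilingual pretrained encoder of the kind the
# study highlights: xlm-roberta-base (mBERT would be the checkpoint
# "bert-base-multilingual-cased"). One model embeds many languages.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModel.from_pretrained("xlm-roberta-base")

# The same encoder maps text from different languages into one space.
sentences = ["natural language processing", "pemrosesan bahasa alami", "自然语言处理"]
batch = tokenizer(sentences, padding=True, return_tensors="pt")
# Crude mean pooling over tokens (ignores padding); fine for a sketch.
embeddings = model(**batch).last_hidden_state.mean(dim=1)
print(embeddings.shape)  # torch.Size([3, hidden_size])
```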

Article Details

How to Cite
Judijanto, L., & Vandika, A. Y. (2025). Emerging Research Trends in Natural Language Processing for Multilingual AI. The Eastasouth Journal of Information System and Computer Science, 2(03), 187–199. https://doi.org/10.58812/esiscs.v2i03.549
