Accelerating Drug Discovery: The Role of Generative AI and Big Data Analytics

Ramchorn Gharami
Delwar Karim
Jhon Kabir
Rashid Khan

Abstract

Drug discovery has long been characterized by long timelines, high costs, and significant risk, often taking more than a decade and billions of dollars to bring a single drug to market. The convergence of generative artificial intelligence (AI) and big data analytics is reshaping this landscape. This paper provides an in-depth analysis of how generative AI, especially models such as generative adversarial networks (GANs), variational autoencoders (VAEs), and transformer-based architectures, combined with vast biological and chemical datasets, is transforming molecular design, target identification, and compound optimization. Through a systematic review of the literature, comparative model evaluation, and real-world case studies including AlphaFold, the paper explores the efficacy of these technologies in accelerating drug discovery. A hybrid methodology combining data mining, model testing, and bioinformatics simulation is employed. The results demonstrate significant improvements in candidate molecule generation, predictive modeling accuracy, and time-to-market for new drugs. Future challenges such as data interoperability, ethical considerations, and regulatory compliance are also discussed. The study concludes by highlighting the potential of AI and big data to usher in a new era of precision medicine and personalized therapeutics.

How to Cite
Gharami, R., Karim, D., Kabir, J., & Khan, R. (2025). Accelerating Drug Discovery: The Role of Generative AI and Big Data Analytics. The Eastasouth Journal of Information System and Computer Science, 3(01), 11–20. https://doi.org/10.58812/esiscs.v3i01.606