Predicting Output Associated with Biological Sequences across Multiple Species Using Deep Learning and Large Language Models
Analysis of biological sequences such as DNA, RNA, and protein is central to bioinformatics and genomics research, including the prediction of sequence-associated outcomes.
Large language models have attracted attention for their success in natural language processing (NLP), and evaluating their predictive performance on biological sequences could be a significant advance for bioinformatics.
Our project will focus on predicting modifications in different types of biological sequences, along with their specific binding properties, using deep learning techniques drawn from NLP. By leveraging these models, we aim to identify biological modifications more effectively and to improve prediction performance over previous studies.
We will also use deep learning methods to integrate sequence-derived features with the 3D structures of the corresponding proteins, RNA, or DNA, which we expect to further improve performance.







