Predicting protein secondary structure based on ensemble Neural Network

  • Emmanuel Gbenga Dada Department of Mathematical Sciences, Faculty of Science, University of Maiduguri, Maiduguri, Nigeria http://orcid.org/0000-0002-1132-5447
  • David Opeoluwa Oyewola Department of Mathematics and Computer Science, Federal University Kashere, Gombe, Nigeria http://orcid.org/0000-0001-9638-8764
  • Joseph Hurcha Yakubu Department of Mathematical Sciences, University of Maiduguri, Maiduguri, Nigeria http://orcid.org/0000-0003-2168-9882
  • Ayotunde Alaba Fadele Department of Computer Science, Federal College of Education, Zaria, Kaduna State, Nigeria http://orcid.org/0000-0002-1125-0780

Abstract

Protein structure prediction is vital to the discovery of new medications based on knowledge of a biological target. It also helps to uncover the biological basis of complex diseases and drug effects. Despite its usefulness, protein structure is very complex, which makes its prediction arduous, time-consuming and costly, and this motivates the development of more effective techniques with high predictive capability. Conventional techniques for predicting protein structure are ineffective, expensive and slow. The reasons include ambiguous, dissimilar sequences among protein structures, uninformative protein data, high-dimensional data, and a highly imbalanced classification task. We propose an Ensemble Neural Network learning model that combines several Neural Network models: Feed Forward Neural Network (FFNN), Recurrent Neural Network (RNN), Cascade Forward Network (CFN) and Non-linear Autoregressive Network with Exogenous inputs (NARX). These models were trained using the Levenberg-Marquardt (LM), Resilient Back Propagation (RBP) and Scaled Conjugate Gradient (SCG) training algorithms to improve performance. Experimental results show that the proposed model outperforms the other models it was compared with.
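
As an illustration of how the outputs of several networks might be combined into one ensemble prediction, the sketch below averages per-residue class probabilities (soft voting) and picks the most probable class. The abstract does not state the combination rule or the class set, so soft voting, the three-state labels H/E/C, the function name soft_vote and the probability values are all assumptions made for this example, not the authors' method.

```python
import numpy as np

# Assumed three-state labels (helix, strand, coil); not specified in the abstract.
CLASSES = ("H", "E", "C")


def soft_vote(member_probs):
    """Average class probabilities across ensemble members.

    member_probs: list of arrays, each of shape (n_residues, n_classes),
    one array per ensemble member.
    Returns an array of predicted class indices, one per residue.
    """
    stacked = np.stack(member_probs, axis=0)  # (n_members, n_residues, n_classes)
    mean_probs = stacked.mean(axis=0)         # (n_residues, n_classes)
    return mean_probs.argmax(axis=1)


# Hypothetical outputs of three ensemble members for two residues.
ffnn = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])
rnn = np.array([[0.5, 0.2, 0.3], [0.1, 0.3, 0.6]])
cfn = np.array([[0.7, 0.2, 0.1], [0.2, 0.2, 0.6]])

predicted = soft_vote([ffnn, rnn, cfn])
labels = [CLASSES[i] for i in predicted]
print(labels)
```

Averaging probabilities lets a confident member outweigh uncertain ones; majority voting over hard labels would be an equally plausible alternative under the same assumptions.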

Author Biography

Emmanuel Gbenga Dada, Department of Mathematical Sciences, Faculty of Science, University of Maiduguri, Maiduguri, Nigeria

Emmanuel Gbenga DADA received his Ph.D. in Computer Science from Universiti Malaya (UM), Malaysia, an MSc in Computer Science from the University of Ibadan (UI), Ibadan, Nigeria, and a Bachelor of Technology in Computer Science from the University of Ilorin, Ilorin, Nigeria. His current research interests are Soft Computing Techniques, Machine Learning Algorithms, Image Segmentation, Swarm Robotics, Cyber Security, and Big Data. He has published several academic papers in reputable international and local journals, conference proceedings, and book chapters. He has served as a reviewer for several ISI and Scopus indexed international journals such as ACM Survey, IEEE Access, and Lecture Notes in Computational Vision and Biomechanics. He is a member of IEEE, the International Society for Knowledge Organization (ISOK-WA), the International Association of Engineers (IAENG), and the Computer Professionals Registration Council of Nigeria (CPN). He is now an Associate Professor of Soft Computing and Computer Science at the Department of Mathematical Sciences (Computer Science Unit), University of Maiduguri, Nigeria.

Published
2021-02-15
How to Cite
Dada, E., Oyewola, D., Yakubu, J., & Fadele, A. (2021). Predicting protein secondary structure based on ensemble Neural Network. ITEGAM-JETIA, 7(27), 49-56. https://doi.org/10.5935/jetia.v7i27.732
Section
Articles