[1] ZHANG Guang-ya, GE Hui-hua, FANG Bai-shan. Application of a BP Algorithm Based Multi-Layer Perceptron Model to Discriminate Thermophilic and Mesophilic Proteins [J]. Journal of Huaqiao University (Natural Science), 2009, 30(2): 161-165. [doi:10.11830/ISSN.1000-5013.2009.02.0161]

Application of a BP Algorithm Based Multi-Layer Perceptron Model to Discriminate Thermophilic and Mesophilic Proteins
Journal of Huaqiao University (Natural Science) [ISSN: 1000-5013 / CN: 35-1079/N]

Volume:
30
Issue:
No. 2, 2009
Pages:
161-165
Section:
Publication date:
2009-03-20

Article Info

Title:
Application of a BP Algorithm Based Multi-Layer Perceptron Model to Discriminate Thermophilic and Mesophilic Proteins
Article ID:
1000-5013(2009)02-0161-05
Author(s):
ZHANG Guang-ya, GE Hui-hua, FANG Bai-shan
Institute of Industrial Biotechnology, Huaqiao University, Quanzhou 362021, China
Keywords:
back-propagation algorithm; multi-layer perceptron; pattern recognition; protein; thermostability
CLC number:
Q51
DOI:
10.11830/ISSN.1000-5013.2009.02.0161
Document code:
A
Abstract:
A multi-layer perceptron model trained with the error back-propagation (BP) algorithm was used to discriminate thermophilic from mesophilic proteins. The stability and generalization ability of the model were assessed by enlarging the training set and applying several validation methods, and the influence of protein size on discrimination accuracy was examined. The model performed best with a momentum parameter of 0.2, a learning rate of 0.5 and 11 hidden-layer nodes, reaching accuracies of 91.5%, 88.2% and 92.1% in the self-consistency check, cross-validation and an independent test on a separate dataset, respectively. It outperformed other pattern recognition methods such as K-nearest neighbors, Naive Bayes and an RBF neural network, and it was robust and generalized well. Prediction accuracy was high for large and medium-sized proteins but low for small proteins.
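
The training setup summarized in the abstract can be illustrated with a minimal sketch. Only the three hyperparameters (momentum 0.2, learning rate 0.5, 11 hidden nodes) come from the abstract. Everything else is an assumption made for illustration: the sigmoid units, the squared-error loss, the online weight updates, the epoch count, the four-feature toy data, and all names (`MLP`, `make_toy_data`, `accuracy`). It is not the authors' implementation or dataset.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class MLP:
    """One-hidden-layer perceptron trained by online back-propagation with momentum."""

    def __init__(self, n_in, n_hidden=11, lr=0.5, momentum=0.2, seed=0):
        rng = random.Random(seed)
        self.lr, self.momentum = lr, momentum
        # Each row holds the weights of one unit; the last entry is its bias.
        self.w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                    for _ in range(n_hidden)]
        self.w_o = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
        # Previous weight changes, needed for the momentum term.
        self.dw_h = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        self.dw_o = [0.0] * (n_hidden + 1)

    def forward(self, x):
        xb = list(x) + [1.0]  # append constant bias input
        h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w_h]
        hb = h + [1.0]
        y = sigmoid(sum(w * v for w, v in zip(self.w_o, hb)))
        return xb, hb, y

    def train_step(self, x, target):
        xb, hb, y = self.forward(x)
        # Output delta for squared error with a sigmoid output unit.
        delta_o = (target - y) * y * (1.0 - y)
        # Hidden deltas are computed before the output weights change.
        delta_h = [delta_o * self.w_o[j] * hb[j] * (1.0 - hb[j])
                   for j in range(len(self.w_h))]
        # Update rule: change = lr * delta * input + momentum * previous change.
        for j, v in enumerate(hb):
            self.dw_o[j] = self.lr * delta_o * v + self.momentum * self.dw_o[j]
            self.w_o[j] += self.dw_o[j]
        for j, row in enumerate(self.w_h):
            for i, v in enumerate(xb):
                self.dw_h[j][i] = (self.lr * delta_h[j] * v
                                   + self.momentum * self.dw_h[j][i])
                row[i] += self.dw_h[j][i]

    def fit(self, X, y, epochs=200, seed=0):
        rng = random.Random(seed)
        order = list(range(len(X)))
        for _ in range(epochs):
            rng.shuffle(order)  # present samples in a new order each epoch
            for k in order:
                self.train_step(X[k], y[k])
        return self

    def predict(self, x):
        return 1 if self.forward(x)[2] >= 0.5 else 0


def accuracy(model, X, y):
    return sum(model.predict(x) == t for x, t in zip(X, y)) / len(X)


def make_toy_data(n, seed):
    """Synthetic two-class data (hypothetical stand-in for protein features)."""
    rng = random.Random(seed)
    X, y = [], []
    for k in range(n):
        label = k % 2
        centre = 0.7 if label else 0.3
        X.append([rng.gauss(centre, 0.1) for _ in range(4)])
        y.append(label)
    return X, y


if __name__ == "__main__":
    X_train, y_train = make_toy_data(60, seed=1)
    X_test, y_test = make_toy_data(40, seed=2)
    model = MLP(n_in=4).fit(X_train, y_train)
    print("self-consistency accuracy:", accuracy(model, X_train, y_train))
    print("independent-test accuracy:", accuracy(model, X_test, y_test))
```

In this sketch, scoring the model on its own training set corresponds to the self-consistency check, and scoring it on a separately generated set corresponds to the independent test. Cross-validation would repeat `fit` on each training fold and average the held-out accuracies.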

Memo:
Supported by the Scientific Research Fund of the Overseas Chinese Affairs Office of the State Council (05Q0018)
Last Update: 2014-03-23