[1] LI Hong-chun, ZHANG Guang-ya, FANG Bai-shan. Using Pseudo Amino Acid Composition to Predict Hydrolase Subfamily[J]. Journal of Huaqiao University (Natural Science), 2010, 31(3): 317-321. [doi:10.11830/ISSN.1000-5013.2010.03.0317]

Using Pseudo Amino Acid Composition to Predict Hydrolase Subfamily

Journal of Huaqiao University (Natural Science) [ISSN: 1000-5013 / CN: 35-1079/N]

Volume:
31
Issue:
No. 3, 2010
Pages:
317-321
Publication date:
2010-05-20

Article Info

Title:
Using Pseudo Amino Acid Composition to Predict Hydrolase Subfamily
Article ID:
1000-5013(2010)03-0317-05
Author(s):
LI Hong-chun, ZHANG Guang-ya, FANG Bai-shan
Institute of Industrial Biotechnology, Huaqiao University, Quanzhou 362021, China
Keywords:
hydrolase subfamily; feature extraction; pseudo amino acid composition; k-nearest neighbor
CLC number:
Q811.4
DOI:
10.11830/ISSN.1000-5013.2010.03.0317
Document code:
A
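Abstract (translated from the Chinese):
Pseudo amino acid composition was used to extract feature values from protein sequences, and the effects of the parameters λ and w on recognition performance were examined, with k-nearest neighbor as the base classifier for predicting hydrolase subfamily types. The results show that, compared with features built from the plain 20-amino-acid composition, pseudo amino acid composition features improve recognition accuracy considerably: the average prediction accuracy of the 20-AA composition was 72.3%, whereas pseudo amino acid composition features reached 82.7%. As for the parameters, the number of autocorrelation functions (λ) had a large effect on recognition performance, while the weight factor w had very little effect.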
Abstract:
Predicting the hydrolase subfamily is of great importance for designing a fast and reliable classification system. In this paper, the pseudo amino acid composition method was used to extract features from protein sequences, and the k-nearest neighbor algorithm was used as the classifier to predict the hydrolase subfamily. The influences of the parameters λ and w on prediction accuracy were also studied. The results showed that the prediction accuracy of pseudo amino acid composition was much higher (by about 10.4 percentage points) than that of amino acid composition: 72.3% for amino acid composition versus 82.7% for pseudo amino acid composition. The parameter λ had more influence on prediction accuracy than the weight factor w.
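The pipeline described in the abstract (pseudo amino acid composition features fed to a k-nearest neighbor classifier) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which physicochemical properties, distance metric, or parameter values were used, so the single hydrophilicity scale, the Euclidean distance, the defaults `lam=3` and `w=0.05`, and the function names `pse_aac` and `knn_predict` are all hypothetical choices made for the example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hopp-Woods hydrophilicity values as commonly tabulated.
# Assumption: one illustrative scale; the paper's property set is not given here.
HYDROPHILICITY = {
    "A": -0.5, "R": 3.0, "N": 0.2, "D": 3.0, "C": -1.0,
    "Q": 0.2, "E": 3.0, "G": 0.0, "H": -0.5, "I": -1.8,
    "L": -1.8, "K": 3.0, "M": -1.3, "F": -2.5, "P": 0.0,
    "S": 0.3, "T": -0.4, "W": -3.4, "Y": -2.3, "V": -1.5,
}


def standardize(scale):
    """Rescale a property table to zero mean and unit standard deviation."""
    values = list(scale.values())
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return {aa: (v - mean) / std for aa, v in scale.items()}


def pse_aac(sequence, lam=3, w=0.05, scale=HYDROPHILICITY):
    """Return a (20 + lam)-dimensional pseudo amino acid composition vector.

    Assumes the sequence contains only the 20 standard residues.
    """
    if len(sequence) <= lam:
        raise ValueError("sequence must be longer than lam")
    prop = standardize(scale)
    counts = Counter(sequence)
    freqs = [counts[aa] / len(sequence) for aa in AMINO_ACIDS]
    # theta_k: mean squared property difference between residues k positions apart.
    thetas = []
    for k in range(1, lam + 1):
        pairs = zip(sequence, sequence[k:])
        total = sum((prop[b] - prop[a]) ** 2 for a, b in pairs)
        thetas.append(total / (len(sequence) - k))
    # w weights the sequence-order terms against the composition terms.
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


def knn_predict(train_vectors, train_labels, query, k=1):
    """Classify `query` by majority vote among its k nearest training vectors."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda item: math.dist(item[0], query),  # Euclidean distance (assumed)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

In this sketch, calling `pse_aac(seq, lam=0)` reduces to the plain 20-amino-acid composition, which is the baseline the abstract compares against; increasing `lam` appends one sequence-order correlation term per step, and `w` only rescales those extra terms.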



Memo:
Supported by the National Natural Science Foundation of China (30770059) and the Doctoral Program Research Fund of the Ministry of Education of China (20070685001)
Last Update: 2014-03-23