TY - GEN
T1 - Clustering binary codes to express the biochemical properties of amino acids
AU - Fu, Huaiguo
AU - Mephu Nguifo, Engelbert
PY - 2005
Y1 - 2005
N2 - We study four kinds of binary codes of amino acids (AA). Two of the codes are based on biochemical properties; the other two are generated with artificial intelligence (AI) methods and are based respectively on protein structures and alignment and on the Dayhoff matrix. To assess the global significance of each binary code, we use a hierarchical clustering method to generate clusters of amino acids from each code. Each cluster is examined against biochemical properties to explain the similarity between the amino acids it contains. To validate this examination, a decision-tree-based machine learning system is used to characterize the AA clusters obtained with each binary code. The experiments show that one of the AI-based codes yields clusters that have significant biochemical properties. Consequently, it appears that even if the attributes of binary codes generated with AI methods do not individually correspond to a biochemical property, they can be significant as a whole. Conversely, binary codes based on biochemical properties can be insignificant when taken as a whole.
AB - We study four kinds of binary codes of amino acids (AA). Two of the codes are based on biochemical properties; the other two are generated with artificial intelligence (AI) methods and are based respectively on protein structures and alignment and on the Dayhoff matrix. To assess the global significance of each binary code, we use a hierarchical clustering method to generate clusters of amino acids from each code. Each cluster is examined against biochemical properties to explain the similarity between the amino acids it contains. To validate this examination, a decision-tree-based machine learning system is used to characterize the AA clusters obtained with each binary code. The experiments show that one of the AI-based codes yields clusters that have significant biochemical properties. Consequently, it appears that even if the attributes of binary codes generated with AI methods do not individually correspond to a biochemical property, they can be significant as a whole. Conversely, binary codes based on biochemical properties can be insignificant when taken as a whole.
KW - Amino acids
KW - Bioinformatics and AI
KW - Classification
KW - Clustering
UR - http://www.scopus.com/inward/record.url?scp=84902458181&partnerID=8YFLogxK
U2 - 10.1007/0-387-23152-8_36
DO - 10.1007/0-387-23152-8_36
M3 - Conference contribution
AN - SCOPUS:84902458181
SN - 038723151X
SN - 9780387231518
T3 - IFIP Advances in Information and Communication Technology
SP - 279
EP - 282
BT - Intelligent Information Processing II - IFIP TC12/WG12.3 International Conference on Intelligent Information Processing, IIP 2004
PB - Springer
Y2 - 21 October 2004 through 23 October 2004
ER -