TY - JOUR
T1 - A Complementary Bioinformatics Approach to Identify Potential Plant Cell Wall Glycosyltransferase-Encoding Genes
AU - Egelund, Jack
AU - Skjøt, Michael
AU - Geshi, Naomi
AU - Ulvskov, Peter
AU - Petersen, Bent Larsen
N1 - Keywords: Amino Acid Motifs; Amino Acid Sequence; Arabidopsis; Arabidopsis Proteins; Cell Wall; Computational Biology; Databases, Protein; Genes, Plant; Glycosyltransferases; Molecular Sequence Data; Phylogeny; Proteome; Sequence Homology, Amino Acid
PY - 2004
Y1 - 2004
N2 - Plant cell wall (CW) synthesizing enzymes can be divided into the glycan (i.e. cellulose and callose) synthases, which are multimembrane-spanning proteins located at the plasma membrane, and the glycosyltransferases (GTs), which are Golgi-localized, single-membrane-spanning proteins believed to participate in the synthesis of hemicellulose, pectin, mannans, and various glycoproteins. In the Carbohydrate-Active enZYmes (CAZy) database, where, e.g., glycoside hydrolases and GTs are classified into gene families primarily on the basis of amino acid sequence similarity, 415 Arabidopsis GTs have been classified. Although much is known about the composition and fine structure of the plant CW, only a handful of CW biosynthetic GT genes, all classified in the CAZy system, have been characterized. In an effort to identify CW GTs that have not yet been classified in the CAZy database, a simple bioinformatics approach was adopted. First, the entire Arabidopsis proteome was run through the Transmembrane Hidden Markov Model 2.0 server, and proteins containing one or, more rarely, two transmembrane domains within the N-terminal 150 amino acids were collected. Second, these sequences were submitted to the SUPERFAMILY prediction server, and sequences predicted to belong to the superfamilies NDP-sugar transferase, UDP-glycosyltransferase/glycogen phosphorylase, carbohydrate-binding domain, Gal-binding domain, or Rossmann fold were collected, yielding a total of 191 sequences. Fifty-two accessions already classified in CAZy were discarded. The resulting 139 sequences were then analyzed using the Three-Dimensional-Position-Specific Scoring Matrix and mGenTHREADER servers, and 27 sequences with similarity to either the GT-A or the GT-B fold were obtained. Proof of concept of the present approach has to some extent been provided by our recent demonstration that two members of this pool of 27 non-CAZy-classified putative GTs are xylosyltransferases involved in synthesis of pectin rhamnogalacturonan II (J. Egelund, B.L. Petersen, A. Faik, M.S. Motawia, C.E. Olsen, T. Ishii, H. Clausen, P. Ulvskov, and N. Geshi, unpublished data).
AB - Plant cell wall (CW) synthesizing enzymes can be divided into the glycan (i.e. cellulose and callose) synthases, which are multimembrane-spanning proteins located at the plasma membrane, and the glycosyltransferases (GTs), which are Golgi-localized, single-membrane-spanning proteins believed to participate in the synthesis of hemicellulose, pectin, mannans, and various glycoproteins. In the Carbohydrate-Active enZYmes (CAZy) database, where, e.g., glycoside hydrolases and GTs are classified into gene families primarily on the basis of amino acid sequence similarity, 415 Arabidopsis GTs have been classified. Although much is known about the composition and fine structure of the plant CW, only a handful of CW biosynthetic GT genes, all classified in the CAZy system, have been characterized. In an effort to identify CW GTs that have not yet been classified in the CAZy database, a simple bioinformatics approach was adopted. First, the entire Arabidopsis proteome was run through the Transmembrane Hidden Markov Model 2.0 server, and proteins containing one or, more rarely, two transmembrane domains within the N-terminal 150 amino acids were collected. Second, these sequences were submitted to the SUPERFAMILY prediction server, and sequences predicted to belong to the superfamilies NDP-sugar transferase, UDP-glycosyltransferase/glycogen phosphorylase, carbohydrate-binding domain, Gal-binding domain, or Rossmann fold were collected, yielding a total of 191 sequences. Fifty-two accessions already classified in CAZy were discarded. The resulting 139 sequences were then analyzed using the Three-Dimensional-Position-Specific Scoring Matrix and mGenTHREADER servers, and 27 sequences with similarity to either the GT-A or the GT-B fold were obtained. Proof of concept of the present approach has to some extent been provided by our recent demonstration that two members of this pool of 27 non-CAZy-classified putative GTs are xylosyltransferases involved in synthesis of pectin rhamnogalacturonan II (J. Egelund, B.L. Petersen, A. Faik, M.S. Motawia, C.E. Olsen, T. Ishii, H. Clausen, P. Ulvskov, and N. Geshi, unpublished data).
U2 - 10.1104/pp.104.042978
DO - 10.1104/pp.104.042978
M3 - Journal article
C2 - 15333752
SN - 0032-0889
VL - 136
SP - 2609
EP - 2620
JO - Plant Physiology
JF - Plant Physiology
IS - 1
ER -