Soybean Full-Length cDNA Database

About RIKEN Soybean Full-Length cDNA Database
Soybean (Glycine max (L.) Merr.) is one of the most important crops in the world. Its agronomic importance has been steadily increasing because it is a major source of protein and vegetable oil for human and animal nutrition. In addition, soybean serves as a valuable renewable agricultural source for industrial products, e.g. lubricating oil, printing ink, or biodiesel. Soybean has a large genome (1,115 Mbp; 2n = 40) that was shaped by complex genome duplication events. It has been suggested that at least one of the original genomes was duplicated prior to the most recent polyploidization event in soybean. The size and complexity of the soybean genome therefore make it difficult to assemble a whole-genome sequence. As in other species that lack completely sequenced genomes, the catalog of gene transcripts in soybean can be obtained through the analysis of soybean cDNAs.
In addition to EST projects, full-length cDNA collections are regarded as an important resource for post-genomic research, and have therefore already been assembled for many organisms. Several techniques have been established to prepare full-length-cDNA-enriched libraries from various organisms. In plants, full-length cDNAs have been collected from Arabidopsis, rice, poplar, wheat, and maize. A major advantage of this approach is that most clones contain the complete coding sequence as well as the 5′ and 3′ untranslated regions (UTRs). Having the entire sequence greatly facilitates subsequent sequencing, annotation, protein expression, and other functional assays. Furthermore, a large collection of full-length cDNA sequences also provides a set of protein sequences, which allows gene functions to be estimated by searching for homology to other proteins, conserved domains, or motifs. Full-length cDNAs are also useful for developing molecular markers from their sequence information.
Our collection was supported by grants from the National Bioresource Project (NBRP, http://www.nbrp.jp/report/reportProject.jsp?project=beans) for Lotus/Glycine, the JIRCAS Comprehensive Research Project ("Comprehensive studies on development of sustainable soybean production technology in South America"), the Grant-in-Aid for Scientific Research (17018005, 18017004 and 18700106) from MEXT, and the RIKEN Plant Science Center. This database includes approximately 4000 full reading sequences that were determined by NBRP.

The soybean full-length cDNA clones will be distributed by the National Bioresource Project for Lotus/Glycine in Japan (http://www.legumebase.agr.miyazaki-u.ac.jp/).
BLAST Search
Records in this database were obtained from the following datasets.
  • nucleotide
    • Glycine max cDNA (RIKEN)
    • Glycine max (dbEST; 2007.7.17)
    • Lotus japonicus (dbEST; 2007.7.17)
    • Medicago truncatula (dbEST; 2007.7.17)
  • peptide
    • Arabidopsis thaliana (TAIR7)
    • Populus trichocarpa (JGI v1.0)
    • Oryza sativa (RAP-DB release1)
    • Plant Proteins (TAIR 2007.6.11)
    • UniProt-TrEMBL plants (2007.5.15)

These are available for BLAST search.

Keyword Search
Search the cDNA database for gene information using arbitrary keywords (e.g. WRKY), Soybean TU ID, scaffold ID, contig ID, or clone ID (e.g. GMFL02-09-F14).
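To prepare queries in bulk, it can help to tell clone IDs apart from free-text keywords before submitting them. The following Python sketch is illustrative only and is not part of the database: the pattern is inferred solely from the single example GMFL02-09-F14, and the function name and the field names (library, plate, row, column) are hypothetical labels, not terms used by the database.

```python
import re

# Assumed pattern, inferred only from the example "GMFL02-09-F14":
# "GMFL" + two digits, a two-digit block, then one letter + two digits.
# The real ID scheme may allow other forms.
CLONE_ID_RE = re.compile(r"^GMFL(\d{2})-(\d{2})-([A-Z])(\d{2})$")


def parse_clone_id(query):
    """Return the parts of a clone-ID-like query, or None for other keywords."""
    match = CLONE_ID_RE.match(query.strip().upper())
    if match is None:
        return None
    library, plate, row, column = match.groups()
    # Field names below are hypothetical labels chosen for illustration.
    return {
        "library": int(library),
        "plate": int(plate),
        "row": row,
        "column": int(column),
    }


if __name__ == "__main__":
    for term in ["GMFL02-09-F14", "WRKY"]:
        kind = "clone ID" if parse_clone_id(term) else "keyword"
        print(f"{term}: treated as {kind}")
```

Anything that does not match the assumed pattern, such as WRKY, is simply treated as an ordinary keyword.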
Related Publication
Umezawa T., Sakurai T. et al. (2008) Sequencing and Analysis of Approximately 40 000 Soybean cDNA Clones from a Full-Length-Enriched cDNA Library. DNA Research, doi:10.1093/dnares/dsn024.

Copyright © 2007 RIKEN Plant Science Center