Much more information can be obtained by determining the complete sequence of nucleotides in a viral genome by either the Maxam-Gilbert method or the dideoxy method of Sanger. Open reading frames (ORFs) are translatable sequences that start with the codon for methionine (AUG) and are uninterrupted by stop codons (UAA, UAG, UGA). The function of the predicted protein can sometimes be surmised from the similarity of its sequence to that of a viral or cellular protein of known function. Such comparisons are carried out by searching international computer databases of nucleotide and amino acid sequences. It is also possible to find characteristic sequences of amino acids, or motifs, that indicate which domains will have particular functions, such as signal sequences for targeting proteins to the endoplasmic reticulum or the plasma membrane, transmembrane sequences, glycosylation sites, and nucleotide binding sites. Short sequence motifs that serve as signals in gene expression can also be identified. The particular methionine codon (AUG) that initiates translation at the beginning of all open reading frames is usually embedded in the consensus sequence GCCGCC(A/G)CCAUGG. Sites of mRNA polyadenylation occur 10-30 bases downstream of the sequence AAUAAA. The start sites for transcription by RNA polymerase II are about 30 base pairs downstream from an A+T-rich sequence, the TATA box. And so on.
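The ORF definition and the AAUAAA signal described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the method used by any particular sequence-analysis package: the function names `find_orfs` and `find_polya_signals` and the `min_codons` parameter are hypothetical, only the three forward reading frames of the given strand are scanned, and DNA input is simply converted to RNA letters.

```python
import re

# Stop codons as listed in the text.
STOP_CODONS = {"UAA", "UAG", "UGA"}


def _as_rna(seq):
    """Upper-case the sequence and write it with RNA letters (T -> U)."""
    return seq.upper().replace("T", "U")


def find_orfs(seq, min_codons=2):
    """Return sorted (start, end) pairs for ORFs in the three forward frames.

    start is the 0-based index of the A of the initiating AUG; end is the
    index just past the stop codon. min_codons (a hypothetical parameter)
    is the minimum number of codons before the stop, counting the AUG.
    """
    rna = _as_rna(seq)
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(rna) - 2, 3):
            codon = rna[i:i + 3]
            if start is None and codon == "AUG":
                start = i  # first in-frame AUG opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


def find_polya_signals(seq):
    """Return 0-based start positions of every AAUAAA in the sequence."""
    rna = _as_rna(seq)
    # The lookahead allows overlapping occurrences to be reported.
    return [m.start() for m in re.finditer(r"(?=AAUAAA)", rna)]
```

For the toy sequence `CCAUGGCUUGAAAUAAACC`, `find_orfs` returns `[(2, 11)]` (the ORF `AUGGCUUGA`) and `find_polya_signals` returns `[11]`. A real analysis would also scan the complementary strand and compare the predicted protein against sequence databases, as the text describes.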