Term used in genetics. A transcription unit is the transcribed sequence together with the promoter and terminator.
Transcription unit
Definition
General information
The human genome contains around 3 billion base pairs, which carry the information of around 23,000 genes. Only a small fraction of the DNA, on the order of 1%, ends up as protein-coding sequence in mRNA. The genes that are transcribed vary in size, with an average length of around 3,000 base pairs. However, the human genome also contains some giant genes. One example is the dystrophin gene, whose mutations cause Duchenne muscular dystrophy and which comprises around 2.5 million base pairs.
Before a gene can be transcribed, it must be found in the genome. Furthermore, transcription does not take place along the entire DNA strand, but only along a specific section. This is defined by the initiation site and the termination site (terminator). The transcribed sequence is located between these two points.
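The relationship between promoter, initiation site, transcribed sequence and termination site can be pictured as coordinates on a DNA string. The following is only a minimal sketch: the class name `TranscriptionUnit`, its fields, the 0-based coordinates and the toy sequence are all assumptions made for illustration and do not come from a real genome or library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranscriptionUnit:
    """Hypothetical container; positions are 0-based indices into a 5'->3' string."""
    promoter_start: int    # first position of the promoter region
    initiation_site: int   # first transcribed position
    termination_site: int  # first position after the transcribed sequence

    def transcribed_sequence(self, dna: str) -> str:
        # The transcribed sequence lies between initiation and termination site.
        return dna[self.initiation_site:self.termination_site]


# Invented toy sequence: promoter (13 bp) + transcribed part (9 bp) + trailing bases.
dna = "GGCTATAAAAGGC" + "ATGCCGTTA" + "TTTTT"
unit = TranscriptionUnit(promoter_start=0, initiation_site=13, termination_site=22)
print(unit.transcribed_sequence(dna))  # ATGCCGTTA
```

Only the slice between the two sites is returned; the promoter and everything after the termination site are left out, as described in the text.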
The RNA polymerase begins transcription at the initiation site, which lies within a promoter. The promoter is a specific region of the DNA strand that is recognized by the polymerase. In eukaryotes, for example, RNA polymerase II locates many promoters by the presence of the so-called TATA box, which is named after its base sequence TATAAA. The TATA box signals to the transcription enzyme that the sequence to be transcribed begins only about 30 base pairs away in the 3′ direction of the coding strand (referred to as downstream). Sequences such as TATAAA, which recur in similar form at corresponding positions in many genes and organisms, are known as consensus sequences, and they carry the same meaning wherever they occur. The TATA box indicates not only the position of the gene but also the strand on which the information is located. Genes can lie on either strand, but the template strand of each gene is read only in the 3′ to 5′ direction, and the mRNA is synthesized from 5′ to 3′.
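How a TATA box points to a nearby start site can be illustrated with a simple string search. This is a rough sketch under stated assumptions: the function name `predicted_start_sites` is hypothetical, the offset of 30 is taken from the approximate figure in the text and is counted here from the first base of the box, and the sequence is invented. Real promoters vary, and the polymerase does not locate them by exact string matching.

```python
from typing import List

TATA_BOX = "TATAAA"
OFFSET = 30  # approximate distance to the start site (assumed to count from the box start)


def predicted_start_sites(strand_5_to_3: str, offset: int = OFFSET) -> List[int]:
    """Return 0-based positions lying `offset` bases downstream of each TATA box."""
    sites = []
    pos = strand_5_to_3.find(TATA_BOX)
    while pos != -1:
        start = pos + offset
        if start < len(strand_5_to_3):  # ignore boxes too close to the end
            sites.append(start)
        pos = strand_5_to_3.find(TATA_BOX, pos + 1)
    return sites


# Invented sequence: 10 bp filler, the TATA box, 24 bp spacer, then the transcribed part.
seq = "GC" * 5 + "TATAAA" + "G" * 24 + "ATGCC"
print(predicted_start_sites(seq))  # [40]
```

The search runs along a string written 5′ to 3′, so a larger index means further downstream.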
Transcription factors can help the RNA polymerase find the initiation site for transcription. The DNA is present as a double helix. The RNA polymerase separates the two strands of the DNA molecule and moves along the template strand in the 3′ to 5′ direction, which corresponds to the 5′ to 3′ direction of the growing pre-messenger RNA.
The designations 5′ and 3′ for the ends of a DNA or RNA strand derive from the nucleotides that make up DNA and RNA: every nucleotide contains a pentose, a sugar with five carbon atoms. The nucleotides are linked to each other at the pentoses via a phosphate group, and the 5th carbon atom of one nucleotide is always connected to the 3rd carbon atom of the neighboring nucleotide. Therefore, at one end of a DNA or RNA strand there is a nucleotide with a free 5th carbon, and at the other end there is a nucleotide with a free 3rd carbon.
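Because the two strands of the double helix run in opposite directions, the 5′ end of one strand lies opposite the 3′ end of the other. A short sketch can make this concrete. It assumes the standard DNA pairing rules (A with T, G with C), and the helper name `reverse_complement` and the example sequence are chosen here purely for illustration.

```python
# Standard DNA base pairing: A-T and G-C.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand_5_to_3: str) -> str:
    """Return the opposite strand, also written in the 5'->3' direction."""
    # Complement each base, then reverse so the result again reads 5'->3'.
    return strand_5_to_3.translate(COMPLEMENT)[::-1]


top = "ATGCCG"                     # one strand, 5'->3'
bottom = reverse_complement(top)   # the other strand, 5'->3'
print(bottom)                      # CGGCAT
```

Applying the function twice returns the original strand, which reflects the symmetry of the two antiparallel strands.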
Opposite the template strand of the DNA, the polymerase joins RNA nucleotides that pair with those of the DNA: G (guanine) pairs with C (cytosine), and A (adenine) pairs with T (thymine) in DNA or with U (uracil) in RNA. In contrast to DNA, RNA has no base T (thymine) but instead the base U (uracil), so an A in the DNA strand being read is paired with a U in the RNA.
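The pairing rules can be written down as a small lookup table. The sketch below is illustrative only: it takes the strand that the polymerase reads (the template strand), written here in its 3′ to 5′ reading direction, and builds the complementary RNA from 5′ to 3′. The function name `transcribe` and the example sequence are assumptions for the illustration.

```python
# RNA base added opposite each DNA base of the strand being read.
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Build the RNA (5'->3') complementary to a template given in 3'->5' order."""
    return "".join(PAIRING[base] for base in template_3_to_5)


template = "TACGGCATT"       # invented template strand, written 3'->5'
print(transcribe(template))  # AUGCCGUAA
```

Note that the result contains U wherever the template has A, and that it has the same sequence as the opposite DNA strand with every T replaced by U.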
During elongation, a pre-messenger RNA is synthesized. The RNA polymerase stops at the termination site of transcription. The pre-messenger RNA then matures into mRNA, which is translated into a protein on ribosomes in the cytoplasm (translation).