The Characteristics of Retroviruses

Retroviruses have various characteristics that make them unique as gene delivery vehicles. Their life cycle includes an integrated state in the DNA of the host chromosome.
Retroviruses are the only animal viruses that integrate into the host cell’s genome during the normal growth cycle. They use an integrase that acts in a site-specific manner to join the ends of the viral cDNA to target sequences in host cell DNA. The linear ds cDNA made in the cytoplasm is transported to the nucleus where it is also found as circles and as integrated DNA. Two forms of circular DNA are generally found: one having a single Long Terminal Repeat (LTR) and one having two LTRs. It is now thought that the original integrated proviruses were linear molecules with two LTRs.
The retroviral promoter can direct high-level, efficient expression of the genes encoded within its genome once the provirus is part of the host chromatin.
Retroviral genomes can accommodate changes to their configuration.
Retroviruses offer gene therapy researchers a means of delivering genes to target cells at high efficiency, allowing long-term, stable expression of the introduced genetic elements.
The retroviral life cycle begins in the nucleus of an infected cell.
At the beginning of the life cycle the retroviral genome is a DNA element integrated into and covalently attached to the DNA of the host cell.
Full-length genomic mRNA is made starting at the beginning of the repeat at the 5′ LTR (Long Terminal Repeat).
The free particle can infect new cells by binding to a cell surface receptor. The specificity of the virus-cell interaction is determined most commonly by the envelope proteins of the retrovirus. Infection leads to injection of the virus nucleoprotein core (consisting of many gag-derived proteins, full-length genomic RNA, and the reverse transcriptase protein).
Once inside the cell, the nucleoprotein complex accesses the intracellular deoxynucleotide triphosphate pools, where reverse transcriptase initiates the creation of a double-stranded DNA copy of the viral genome, which is prepared for integration into the host cell chromosome. When reverse transcription is complete, the viral enzyme integrase finds a suitable site in the host DNA, clips the host DNA there, and joins the double-stranded viral DNA to it.
The virus is then able to initiate a new round of replication.
Three major proteins are encoded in a retroviral genome:
Gag is a polyprotein; the name is short for group-specific antigen.
Pol is the reverse transcriptase.
Env is the envelope protein.
The group antigens form the viral core structure and are the major proteins which comprise the nucleoprotein core particles.
Reverse transcriptase is the essential enzyme that carries out reverse transcription, the process that takes the RNA genome to a double-stranded DNA preintegration form. General transcription and proteins are encoded from spliced mRNA of retroviruses.
Transcription proceeds through the genome and mRNA is polyadenylated and processed using signals in transcribed regions from the 3′ LTR at the end of the transcribed R (repeat). The full-length message can be spliced to lead to production of envelope proteins (or other proteins depending upon retroviral class). Unspliced full-length mRNA can give rise to gag-pol proteins. Gag and Pol are made as either Gag protein or a Gag-Pol precursor.
Translated proteins assemble a retroviral particle at the cell surface. Full-length genomic unspliced mRNA is bound by gag-derived proteins and incorporated into the budding particle.
Virion structures – In retroviruses particle shapes can be divided into distinct categories:
A-type particles are immature intracellular forms derived from endogenous retrovirus-like elements and the immature form of MMTV.
B-type particles correspond to the extracellular form of MMTV and are characterised by prominent surface protein “spikes” and a dense acentric nucleocapsid.
C-type particles form at the surface of the cell at the site of budding. Lentiviruses bud like C-type particles but have a distinctive blunted, cone-shaped core.
D-type particles are the MPMV (Mason-Pfizer monkey virus) related viruses of sub-human primates, and differ from B-type particles by a lack of surface spikes.
The gag (group-specific antigen) gene encodes the viral matrix, capsid and nucleoproteins.
The protease gene encodes a product that cleaves the Gag polyprotein precursor. It can be encoded as part of Gag or of a Gag-Pro-Pol polyprotein.
The major read-through product is derived from the pol gene which encodes the reverse transcriptase and an integrase which is involved in provirus integration.
The envelope gene encodes the surface glycoprotein (SU) – transmembrane (TM) polyprotein.
Viral entry – Retroviruses enter cells in at least two different ways, depending on the retroviral subclass. The viral envelope is critical in each case for recognising appropriate surface receptors to initiate viral fusion to the host target cells.
The RNA genome in the free retrovirus is diploid: each particle carries two copies of the same sequence. The genomic RNA associates with a tRNA primer (pro, trp, or lys) that is bound by complementary base pairing to an 18-base sequence next to the U5 region.
The integrated (proviral) form of all retroviruses contains transcription regulatory sequences, primarily in the Long Terminal Repeats (LTRs). LTR sequences are derived from sequences unique to the 5′ end of viral RNA (U5), from sequences unique to the 3′ end of viral RNA (U3), and from sequences repeated at both ends of the viral RNA (R). The integrated provirus is larger than the viral genome but its complexity is the same, because U3 and U5 are duplicated during synthesis.
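The relationship between the RNA genome and the provirus can be sketched in a few lines of Python. This is only a toy model under stated assumptions: regions are represented by their names rather than by real sequences, the function names rna_genome and provirus are made up for illustration, and the coding region is reduced to gag, pol and env.

```python
# Toy model: each region is a label, not a real nucleotide sequence.

def rna_genome(body):
    """Layout of the genomic RNA: R-U5 at the 5' end, U3-R at the 3' end."""
    return ["R", "U5"] + list(body) + ["U3", "R"]

def provirus(genome):
    """Layout of the integrated provirus: a complete U3-R-U5 LTR at each end."""
    body = genome[2:-2]            # drop R-U5 (5' end) and U3-R (3' end)
    ltr = ["U3", "R", "U5"]        # U3 and U5 are duplicated during synthesis
    return ltr + body + ltr

genome = rna_genome(["gag", "pol", "env"])
dna = provirus(genome)

print("-".join(genome))   # R-U5-gag-pol-env-U3-R
print("-".join(dna))      # U3-R-U5-gag-pol-env-U3-R-U5
```

The provirus has two more regions than the RNA genome, yet it contains no region that the genome lacks, which is the sense in which it is larger but no more complex.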
Replication of retroviruses is sensitive to the transcription inhibitors actinomycin D and alpha-amanitin, and to nucleoside analogues like 5-bromodeoxyuridine and cytosine arabinoside. 5-Bromodeoxyuridine and cytosine arabinoside are thought to inhibit DNA replication.

History of DNA Sequencing and Research

DNA sequencing technology has evolved very rapidly since its inception in the 1970s, and continues to evolve and grow today. This paper will review the major innovations and developments in sequencing technology and briefly summarize their methodologies.
The first group that was able to sequence DNA was the team of Allan Maxam and Walter Gilbert (Maxam and Gilbert). This was a first generation sequencing reaction, and was developed in 1976-1977. This method uses purified DNA and relies on chemical modification of DNA bases (such as depurination of adenine and guanine using formic acid, methylation of guanine using dimethyl sulfate, and modification of pyrimidines using hydrazine). The 5′ end is radioactively labeled so that it can be visualized in a gel, and then fragments of modified DNA are electrophoresed. Autoradiography can then be used to visualize the sizes of each DNA fragment. The maximum read length for this technique was approximately 100 bases.
The next major innovation in DNA sequencing was the Sanger dideoxy chain termination method. This was developed in 1977 by Frederick Sanger (Sanger, Nicklen, and Coulson), and became much more popular than Maxam and Gilbert’s method. Sanger sequencing is a synthesis reaction and uses dideoxy nucleotides to randomly terminate synthesized strands of DNA. The DNA strands that had been terminated with ddNTPs originally were run in 4 different lanes (one for each ddNTP) and were radiolabeled so that they could be visualized with autoradiography. Later innovations made Sanger sequencing even easier when each dideoxynucleotide was labeled with different fluorescent dyes. As such, sequences could be run on a single gel in a single lane. This method was the most popular way of sequencing DNA for many years, and was prevalent until about 2004. While read length was initially about 100 base pairs long, Sanger sequencing now has a read length of about 800 to 1000 base pairs long when run in capillary gels.
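The logic of reading a four-lane Sanger gel can be illustrated with a short Python sketch. It is a simplification under stated assumptions: instead of random termination, it lists every fragment length at which each ddNTP could stop the chain, and the function names sanger_lanes and read_gel are hypothetical.

```python
def sanger_lanes(synthesized):
    """For each ddNTP lane, list the lengths of the fragments ending in that base."""
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(synthesized, start=1):
        lanes[base].append(length)   # a ddNTP of this base stops the chain here
    return lanes

def read_gel(lanes):
    """Read the gel from the shortest fragment to the longest to recover the sequence."""
    base_at_length = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(base_at_length[length] for length in sorted(base_at_length))

lanes = sanger_lanes("GATTACA")
print(lanes)             # {'A': [2, 5, 7], 'C': [6], 'G': [1], 'T': [3, 4]}
print(read_gel(lanes))   # GATTACA
```

With fluorescent ddNTPs the same reading step applies, except that the base is given by the dye color in a single lane rather than by the lane a band appears in.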
With the start of the human genome project, it was necessary to find ways to sequence DNA much more quickly and more cost-effectively than had been done previously. This led to the development of so-called “second generation” DNA sequencers. It also allowed for the use of smaller samples for sequencing.
One of the first major automated platforms was the Roche 454 (Margulies et al.). It uses pyrosequencing, a sequencing by synthesis reaction, on templates amplified by emulsion PCR on beads. When a dNTP is incorporated, it releases a pyrophosphate (PPi). ATP sulfurylase in the reaction mix converts the released PPi to ATP, which luciferase uses to emit light. The Roche 454 measures the amount of light given off and relates it to the number of nucleotides that have been incorporated. One problem with this type of sequencing is that runs of the same nucleotide (homopolymers) are difficult to call accurately, because the light intensity does not increase linearly with the length of the run. The read length for 454 is approximately 250 base pairs, and errors tend to be indels.
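A pyrosequencing run can be pictured as a flowgram: one light signal per nucleotide flow. The Python sketch below is an idealized model, assuming a perfectly linear signal (one unit of light per incorporated base), a repeating T, A, C, G flow order, and an input that is the strand being synthesized; the names flowgram and call_bases are made up for illustration.

```python
def flowgram(synthesized, flow_order="TACG", cycles=4):
    """Return (base, signal) pairs; the ideal signal is the homopolymer length added."""
    signals = []
    pos = 0
    for _ in range(cycles):
        for base in flow_order:
            run = 0
            # Every consecutive copy of the flowed base is incorporated in one flow.
            while pos < len(synthesized) and synthesized[pos] == base:
                run += 1
                pos += 1
            signals.append((base, run))
    return signals

def call_bases(signals):
    """Turn the flowgram back into a sequence by repeating each base 'signal' times."""
    return "".join(base * signal for base, signal in signals)

signals = flowgram("TTAGGGC")
print(signals[:5])           # [('T', 2), ('A', 1), ('C', 0), ('G', 3), ('T', 0)]
print(call_bases(signals))   # TTAGGGC
```

In this ideal model a run of three G bases gives exactly three units of light. On a real instrument the signal for long runs is harder to tell apart (for example, 7 versus 8 bases), so a miscalled run length shows up as an insertion or deletion.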
The next major second generation sequencer is the Illumina Solexa platform (Bennett). It sequences by synthesis using reversible terminators. A flow cell is covered with DNA oligonucleotides that are complementary to adaptor sequences that have been ligated to the ends of genome fragments. As the fragments are streamed across the surface of the flow cell, they bind at random and go through multiple cycles of denaturation and extension, which creates clusters of clones. After the clusters have been generated, the flow cell is loaded into a sequencer, which takes a picture as each single nucleotide is incorporated and records the location of the fluorescence. Read lengths are about 26-50 bases on average, and errors tend to be single-base substitutions.
Another important second generation sequencer is the ABI SOLiD (Sequencing by Oligonucleotide Ligation and Detection) platform (Valouev et al.). This is another sequencing by synthesis reaction, but unlike Illumina and 454, which use polymerases, it uses ligases. After emulsion PCR on beads creates clonal clusters, a sequencing primer base pairs to a known adapter sequence that has been ligated to the genomic DNA. Differently labeled probes compete to base pair with the template next to the sequencing primer, and sequencing goes through several rounds, each using a different primer that binds at a position offset by a single nucleotide. DNA bases are read in groups of two in this method. Average read lengths are about 35 base pairs.
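Reading bases in groups of two means that each fluorescent signal reports on a pair of adjacent bases rather than a single base. The Python sketch below illustrates one common way of describing this two-base ("color space") encoding. The specific mapping used here (numbering A, C, G, T as 0 to 3 and combining each pair with XOR) is an assumption that is not taken from the text above, and the function names encode and decode are hypothetical.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASE = {value: base for base, value in CODE.items()}

def encode(seq):
    """One color (0-3) per overlapping pair of adjacent bases."""
    return [CODE[a] ^ CODE[b] for a, b in zip(seq, seq[1:])]

def decode(first_base, colors):
    """Rebuild the sequence from a known first base and the list of colors."""
    seq = first_base
    for color in colors:
        seq += BASE[CODE[seq[-1]] ^ color]
    return seq

colors = encode("ATGGC")
print(colors)                # [3, 1, 0, 3]
print(decode("A", colors))   # ATGGC
```

Because every base contributes to two neighboring colors, decoding needs one known base to start from, which in this sketch is passed in as first_base.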
The next second generation sequencing technique is Ion Torrent, which is a sequencing by synthesis technique. When nucleotides are added to a growing DNA chain, pyrophosphate and a hydrogen ion are released. Ion Torrent takes advantage of this by measuring the pH of the reaction mix after flooding a DNA strand with the four bases (one at a time) to determine sequences. One major advantage of this technique is that it doesn’t require a high-cost camera set-up to measure incorporation events. However, because it measures nucleotide addition indirectly through changes in pH, it has difficulty calling homopolymer runs accurately, resulting in indel errors (like pyrosequencing). Average read lengths using this technique are about 200 base pairs.
A more recent innovation is the Helicos-True Single Molecule Sequencing (tSMS) technique (Thompson and Steinmann). It is somewhat similar to Illumina sequencing in that it also uses fragmented DNA, adaptors, and fluorescently labeled dNTPs, but there is no amplification step. This helps eliminate issues with GC bias, which tend to affect amplification steps and can cause errors in base calling. Average read length is greater than 25 base pairs.
Pacific Biosciences’ SMRT technology (Single Molecule Real Time sequencing) immobilizes a DNA polymerase at the bottom of a well and is a sequencing by synthesis technique (Eid et al.). Fluorescently labeled phosphate groups in dNTPs are added to the reaction mix and as the base is added to the growing DNA strand, the machine can measure the light that is given off (each base is labeled with a different fluorescent molecule). The major advantage of this technique is that it can sequence very long reads (more than 1000 bp!) which is very important in de novo sequence assembly. In addition, PacBio can also measure methylation of DNA sequences based on the kinetics of addition of base pairs (using the observation that modified base pairs tend to take longer to incorporate into a DNA strand). Furthermore, this technique can also potentially use a single molecule of DNA, which reduces any GC bias that occurs due to amplification.
The final technique that will be discussed here is nanopore sequencing (Stoddart et al.). The idea behind this is that DNA may be threaded through a nanopore one base at a time. As it’s fed through, the sequencer can measure the change in current as it passes through (which will vary based on what base is moving through the pore). Thus, the sequence can be determined straight from the DNA without the need for modifications or reagents. In addition, because this can be done on a single molecule, there is again no need for amplification and thus no possibility of any GC bias in base calls.
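The idea of reading bases from current changes can be illustrated with a deliberately simple Python sketch. It assumes that each base produces one characteristic current level and that the trace has already been split into one measurement per base; the current values in LEVELS are invented for illustration, and the names call_base and call_sequence are hypothetical.

```python
# Hypothetical mean current (in picoamps) while each base sits in the pore.
LEVELS = {"A": 50.0, "C": 60.0, "G": 70.0, "T": 80.0}

def call_base(current):
    """Pick the base whose expected current level is closest to the measurement."""
    return min(LEVELS, key=lambda base: abs(LEVELS[base] - current))

def call_sequence(trace):
    """Call one base per current measurement in the trace."""
    return "".join(call_base(current) for current in trace)

trace = [51.2, 78.9, 69.5, 61.0]
print(call_sequence(trace))   # ATGC
```

The sketch only shows the principle of mapping a current measurement to the nearest expected level; it makes no claim about how a real nanopore base caller works.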