Xu, Xiufeng; Arnason, U., 1997. The complete mitochondrial DNA sequence of the white rhinoceros, Ceratotherium simum, and comparison with the mtDNA sequence of the Indian rhinoceros, Rhinoceros unicornis. Molecular Phylogenetics and Evolution 7 (2): 189-194, fig.1, tables 1-4

Location: World
Subject: Genetics
Species: White Rhino


Original text on this topic:
Mitochondrial DNA sequence of Ceratotherium simum. [For tables, see file Xu & Arnason]
The complete nucleotide sequence of the mitochondrial genome of the white rhinoceros, Ceratotherium simum, was determined. The length of the reported sequence is 16,832 nucleotides. This length can vary, however, due to pronounced heteroplasmy caused by differing numbers of a repetitive motif (5'-CGCATATACA-3') in the control region. The 16,832 nucleotide sequence presented here is the longest version of the molecule and contains 35 copies of this motif. Comparison between the complete mitochondrial sequences of the white and the Indian (Rhinoceros unicornis) rhinoceroses allowed an estimate of the date of the basal evolutionary divergence among extant rhinoceroses. The calculation suggested that this divergence took place approximately 27 million years before present.
Introduction
The perissodactyl family Rhinocerotidae includes four recent genera, Rhinoceros (Indian and Javan rhinoceroses), Dicerorhinus (Sumatran rhinoceros), Diceros (black rhinoceros), and Ceratotherium (white rhinoceros). Rhinoceros and Dicerorhinus occur in Asia, whereas Diceros and Ceratotherium are found in Africa. The family Rhinocerotidae is distinct and well defined, but systematic relationships among the four genera (reviewed by Morales and Melnick, 1994) are controversial. Evolutionary relationships within the Rhinocerotidae have generally been evaluated on the basis of the geographical distribution of the different genera or the number of horns (one or two) characterizing these genera. In the present study we describe the complete mtDNA sequence of a two-horned species, the white rhinoceros, and compare it in detail with that of a one-horned species, the Indian rhinoceros (Xu et al., 1996b). This comparison makes it possible to address the date of the basal rhinocerotid divergence irrespective of whether it is defined by geographical distribution or number of horns. The date of divergence between the two rhinoceroses was estimated using the evolutionary separation of Artiodactyla and Cetacea, set at 60 million years before present (MYBP), as a reference (Arnason and Gullberg, 1996). The position of the black rhinoceros, Diceros bicornis, relative to the other two species was also examined; however, only a limited amount of sequence data is currently available for this species compared to the Indian and the white rhinoceroses.
Materials and methods
A total of 1.5 g of striated muscle from the white rhinoceros, Ceratotherium simum, was kindly provided by Dr. Peter Arctander, Department of Zoology, Univ. of Copenhagen, Denmark. An enriched mtDNA fraction was isolated as described by Arnason et al. (1991). The enriched mtDNA was digested separately with HindIII, SpeI, XbaI, BlnI, BglII, and BclI. The products were ligated directly into M13 and cloned in Escherichia coli JM101. Positive clones were identified through hybridization, using mtDNA fragments from the horse and the donkey as labeled probes. The clones covered the entire molecule except for a 3,971-nucleotide (nt)-long region located between positions 11,622 and 15,592 of the complete sequence. This region was PCR amplified in two separate reactions prior to cloning. Six individual PCR clones from each reaction (all identical) were then used to determine the sequence of the region.
The sequence of the control region was determined by sequencing two natural (not PCR) clones. In addition, the number and sequence of repeat motifs in this region were determined from a total of 58 clones derived from PCR amplification. Sequencing followed the dideoxy termination technique (Sanger, 1977) with [35S] dATP, using both universal and numerous specific oligonucleotide primers.
Handling of sequences and alignments was performed with the GCG (1994) program package. Insertion/deletion (indel) differences were counted as single events irrespective of their lengths. Conservative nucleotide changes (Irwin et al., 1991) included all substitutions at the 1st codon position except leucine transitions, all substitutions at the 2nd codon position, and transversions at the 3rd codon position.
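The definition of conservative nucleotide changes given above can be sketched in code. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, and it assumes that a "leucine transition" means a 1st-position transition in which both codons encode leucine (CTN or TTR in the vertebrate mitochondrial code).

```python
# Hypothetical helper names; a sketch of the conservative-change rule only.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
# Leucine codons in the vertebrate mitochondrial genetic code (CTN and TTR).
LEUCINE_CODONS = {"TTA", "TTG", "CTT", "CTC", "CTA", "CTG"}


def is_transition(a, b):
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    return a != b and ({a, b} <= PURINES or {a, b} <= PYRIMIDINES)


def is_conservative(codon_a, codon_b, pos):
    """Classify the difference at codon position pos (1, 2, or 3).

    Assumption: a leucine transition is a 1st-position transition
    between two leucine codons (for example CTA <-> TTA).
    """
    if pos not in (1, 2, 3):
        raise ValueError("pos must be 1, 2, or 3")
    a, b = codon_a[pos - 1], codon_b[pos - 1]
    if a == b:
        return False  # no substitution at this position
    if pos == 1:
        leucine_transition = (
            is_transition(a, b)
            and codon_a in LEUCINE_CODONS
            and codon_b in LEUCINE_CODONS
        )
        return not leucine_transition
    if pos == 2:
        return True  # every 2nd-position substitution counts
    return not is_transition(a, b)  # 3rd position: transversions only
```

For example, `is_conservative("CTA", "TTA", 1)` is false under this reading (a leucine transition), whereas `is_conservative("GCA", "GCC", 3)` is true (a 3rd-position transversion).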
The mtDNA sequence of the white rhinoceros has been deposited at EMBL with Accession No. Y07726. Users of this sequence are kindly requested to refer to the present paper in addition to the accession number of the sequence.
Results
The length of the mtDNA sequence presented here for the white rhinoceros, Ceratotherium simum, is 16,832 nt. As with other perissodactyl species for which complete mtDNA sequences are available - horse (Xu and Arnason, 1994), donkey (Xu et al., 1996a), and Indian rhinoceros (Xu et al., 1996b) - the control region of the white rhinoceros contains variable numbers of repeat motifs arranged in tandem. The lengths of individual mtDNA sequences of the white rhinoceros can therefore vary (heteroplasmy). The base composition of the L-strand, excluding the control region, is A, 33.4%; C, 27.9%; G, 12.9%; and T, 25.8%. In the control region of the reported sequence (35 repeats) the corresponding values are A, 33.7%; C, 28.9%; G, 12.0%; and T, 25.4%.
Positions of protein-coding genes were determined by the occurrence of start and stop codons and by analogy with other complete eutherian mtDNA sequences. The start codon of the NADH3 gene is ATT (isoleucine). All other genes have a methionine start codon. Three protein-coding genes - COIII, NADH3, and NADH4 - have incomplete stop codons (TA or T). In all three cases, the terminal nt is contiguous to the 5' terminal nt of a tRNA gene. As discussed by Wolstenholme, and consistent with the findings of Ojala et al., the transcripts of such genes contain a stop codon created by posttranscriptional polyadenylation. Among other eutherians studied so far, only the fin and the blue whales (Arnason et al., 1991; Arnason and Gullberg, 1993) have a complete stop codon in COIII, while the NADH3 gene is terminated by a complete stop codon only in the mouse (Bibb et al., 1981), the rat (Gadaleta et al., 1989), and the hedgehog (Krettek et al., 1995). A complete stop codon has not yet been described in the NADH4 gene of any eutherian.
The control region of the sequence reported here is 1,381 nt long with a continuous run of 35 repeat motifs (5'-CGCATATACA-3'). These repeats are located in the 3' part of the control region, between positions 16,187 and 16,536 of the complete sequence. Like most repeat motifs in eutherian control regions, the white rhinoceros motif is characterized by a purine/pyrimidine alternation (Ghivizzani et al., 1993; Xu and Arnason, 1994). In order to determine the range of variation among different control regions, we sequenced 60 different clones (two natural plus 58 PCR clones) of the repeat part of the control region. The number of motifs among these clones varied from 10 to 35. Most sequences fall in the middle of the range of repeat numbers (see Fig. 1). The control region of the white rhinoceros, like that of the horse (Xu and Arnason, 1994), is characterized by only one type of repeat motif. Both species are heteroplasmic with respect to the number of repetitive motifs, but unlike that of the white rhinoceros, the distribution of repetitive motifs in the horse is bimodal and skewed toward high numbers of motifs (see Fig. 1 in Xu and Arnason, 1994).
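The figures in this paragraph can be cross-checked with a short sketch. The names below are hypothetical, and the length calculation assumes that molecules differ only in the number of copies of the 10-nt motif, which the text implies but does not state outright.

```python
MOTIF = "CGCATATACA"          # repeat motif reported for the white rhinoceros
FULL_LENGTH = 16832           # longest version of the molecule (35 repeats)
MAX_REPEATS = 35
REPEAT_START, REPEAT_END = 16187, 16536  # positions of the repeat array


def genome_length(n_repeats):
    """Molecule length for a given repeat count.

    Assumption: length variation comes only from the repeat count.
    """
    return FULL_LENGTH - (MAX_REPEATS - n_repeats) * len(MOTIF)


def alternates_purine_pyrimidine(seq):
    """True if purines (A, G) and pyrimidines (C, T) strictly alternate."""
    classes = ["R" if base in "AG" else "Y" for base in seq]
    return all(x != y for x, y in zip(classes, classes[1:]))


# The reported coordinates span exactly 35 copies of the 10-nt motif.
repeat_span = REPEAT_END - REPEAT_START + 1
print(repeat_span, MAX_REPEATS * len(MOTIF))      # 350 350

# Length range implied by the observed 10 to 35 repeats.
print(genome_length(10), genome_length(35))       # 16582 16832

# The alternation also holds across repeat boundaries in a tandem array.
print(alternates_purine_pyrimidine(MOTIF * 3))    # True
```

Under that assumption, the 60 sequenced clones correspond to molecules between 16,582 and 16,832 nt long.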
The sequence of the repeat motif in the control region of the white rhinoceros is identical to one and similar to the other (5'-CGCACACACA-3') repeat motif in the control region of the black rhinoceros (Jama et al., 1993).
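How similar the two motifs are can be made concrete with a position-by-position comparison. The sketch below uses hypothetical names and only the two motif sequences quoted in the text.

```python
WHITE_MOTIF = "CGCATATACA"          # white rhinoceros motif
BLACK_SECOND_MOTIF = "CGCACACACA"   # the non-identical black rhinoceros motif
PURINES = set("AG")


def classify(a, b):
    """Label a substitution as a transition or a transversion."""
    same_class = (a in PURINES) == (b in PURINES)
    return "transition" if same_class else "transversion"


def motif_differences(m1, m2):
    """List (1-based position, base in m1, base in m2, type) for mismatches."""
    if len(m1) != len(m2):
        raise ValueError("motifs must have equal length")
    return [
        (i, a, b, classify(a, b))
        for i, (a, b) in enumerate(zip(m1, m2), start=1)
        if a != b
    ]


diffs = motif_differences(WHITE_MOTIF, BLACK_SECOND_MOTIF)
print(diffs)  # [(5, 'T', 'C', 'transition'), (7, 'T', 'C', 'transition')]
```

The two motifs differ at two of ten positions, both T/C transitions, so the purine/pyrimidine alternation is preserved in both.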
Comparison with the mtDNA of Indian rhinoceros
Alignment of the complete mtDNA sequences for the white and the Indian rhinoceroses outside the control regions showed 17 indel differences between the two sequences. One indel (a codon triplet) was in the ATPase8 gene, which is 207 nt long in the white rhinoceros as compared to 204 nt in the Indian rhinoceros, the horse, and the donkey. This indel difference is at the C terminus of the gene. There were six indels in the 12S rRNA gene and eight in the 16S rRNA gene. One indel was observed in each of the tRNA-Leu(UUR) and the tRNA-Arg gene alignments. These indels were located in the D-loop region of the inferred secondary structure of the tRNA genes. Apart from indel differences, the two molecules (excluding the control regions) differed at 1,666 nt positions (10.8%).
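The 10.8% figure can be roughly reproduced from numbers given elsewhere in the text. This is only a plausibility check: the length of the alignment actually used is not stated, so the sketch assumes the denominator is the white rhinoceros sequence minus its control region.

```python
GENOME_LENGTH = 16832          # reported white rhinoceros sequence, nt
CONTROL_REGION_LENGTH = 1381   # control region of the reported sequence, nt
DIFFERING_POSITIONS = 1666     # substitutions outside the control regions

# Assumption: the compared length equals the white rhinoceros sequence
# without its control region; the true alignment length is not given.
compared_length = GENOME_LENGTH - CONTROL_REGION_LENGTH
percent_difference = 100 * DIFFERING_POSITIONS / compared_length

print(compared_length)                 # 15451
print(round(percent_difference, 1))    # 10.8
```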
The 13 protein-coding genes of the white and the Indian rhinoceroses were compared with respect to both total nt difference and conservative nt difference (Irwin et al., 1991; Table 1). The nt differences were examined with regard to both the codon position and the type of substitution (transition, transversion). The genes differ at 1,391 positions (12.2%). The codon position ratio for total nt difference is 2.6:1:11.1. The total number of conservative nt differences is 418 (3.7%). The codon position ratio for these differences is 1.5:1:1.9.
Table 2 shows total nt, conservative nt, and aa differences for each of the 13 protein-coding genes of the white and the Indian rhinoceroses. The genes have been arranged according to increasing aa difference. This order and that based on conservative nt difference are reasonably consistent, whereas the order based on total nt difference is markedly different. The relationship among different modes of comparisons has been discussed previously in comparisons of several closely related species pairs (Xu and Arnason, 1996a) and will therefore not be detailed here. The range for total nt difference is between 9.7% (NADH2) and 15.5% (ATPase8), that for conservative nt difference is between 1.5% (COII) and 8.7% (ATPase8), and that for aa difference is between 1.2% (COI) and 18.8% (ATPase8). The amplitude between the lowest and highest values for total nt difference is limited compared to that for conservative nt and aa difference, suggesting a high degree of saturation in the dataset for total nt difference.
Table 3 gives details of four pairwise perissodactyl comparisons: horse/donkey, white/Indian rhinoceroses, horse/white rhinoceros, and horse/Indian rhinoceros. The horse and the donkey form the most closely related species pair, whereas the comparison between the horse and the rhinoceroses represents the deepest perissodactyl divergence. We have previously dated the divergence between the horse and the donkey to ca. 9 MYBP (Xu et al., 1996a), and that between the families Equidae and Rhinocerotidae to ca. 50 MYBP (Xu et al., 1996b). With respect to total nt difference, a saturation effect is pronounced in the comparison between the horse and the rhinoceroses, leading to a difference that is only 2.3 times greater than that between the horse and the donkey. The differences among the 12S rRNA and the 16S rRNA genes, respectively, in the three divergences (counting the relationship between the horse and the two rhinoceroses as one divergence) differ somewhat from those based on aa and conservative nt differences. While this discrepancy may to some extent be due to stochastic fluctuations, the effect of multiple hits (both transitions and transversions) in fast-evolving rRNA sites cannot be ruled out.
The differences in codon position ratios (both total and conservative nt substitution) in the three divergences are consistent with differences in rates and mode of substitution at the three codon positions.
The combined sequences of the tRNA genes of the two rhinoceroses differ by 5.8%. The greatest differences (eight transitions and one transversion, respectively) were registered between the tRNA-Glu and tRNA-Thr genes.
The sequences of two complete tRNA genes, tRNA-Phe and tRNA-Pro, from the black rhinoceros were reported by Jama et al. (1993). The tRNA-Phe genes of the white and the black rhinoceroses differ by two transitions, as compared to seven transitions and one transversion between the white and the Indian rhinoceroses. The tRNA-Pro genes of the white and the black rhinoceroses differ by three transitions, one less than that between the white and the Indian rhinoceroses.
The stem region of the origin of replication of the L-strand is identical in the white and the Indian rhinoceroses, while the loop regions differ by three transitions.
