Members of the Coronaviridae family are enveloped, non-segmented, positive-strand RNA viruses with genome sizes ranging from 26 to 32 kb. The family is classified into two subfamilies: Letovirinae, which contains a single genus, Alphaletovirus; and Orthocoronavirinae, which consists of the alpha-, beta-, gamma-, and deltacoronaviruses (CoVs). Alpha- and beta-CoVs mainly infect mammals and cause human and animal diseases. Gamma- and delta-CoVs mainly infect birds, but some can also infect mammals. Six human CoVs (HCoVs) are known to cause human diseases. HCoV-HKU1, HCoV-OC43, HCoV-229E, and HCoV-NL63 commonly cause mild respiratory illness or asymptomatic infection; however, severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV) have caused severe disease with 10% and 35% mortality, respectively. CoV infection in domestic animals can also cause great economic losses; examples include transmissible gastroenteritis virus, porcine epidemic diarrhea virus, and HKU2-related CoV in pigs.
Prior to the global SARS outbreak in 2002 to 2003, only 10 CoVs had been reported. Since the outbreak, numerous CoVs have been discovered in animals, particularly in bats. According to a recent report by the International Committee on Taxonomy of Viruses (ICTV), at least 17 out of 29 assigned alpha- and beta-CoV species were identified from 11 out of 18 bat families. Phylogenetic analysis suggested that bats are major hosts for alpha- and beta-CoVs. Recombination between different CoVs has also been reported in bats. Bats therefore play an important role in CoV evolution.
Rhinolophus bats are widespread in China. At least 4 CoV species with high genetic diversity have been found in members of this genus. Among these viruses, bat SARS-related coronaviruses (SARSr-CoVs) have been shown to infect animal and human cells by using the same receptor as SARS-CoV. Recently, a new porcine disease was confirmed to be caused by a BatCoV HKU2-related virus in Guangdong Province, China. These findings indicate that these bat species play important roles in CoV evolution and transmission.
Here, we report a novel alpha-CoV species discovered in Rhinolophus bats in China, its unique genomic structure, a preliminary functional assessment of its accessory genes, and the infectivity of this virus in different cells.
2.1. Ethics Statement
All sampling procedures were performed by veterinarians, with approval from the Animal Ethics Committee of the Wuhan Institute of Virology (WIVH5210201). The study was conducted in accordance with the Guide for the Care and Use of Wild Mammals in Research of the People’s Republic of China.
Bat fecal swab and pellet samples were collected from November 2004 to November 2014 in different seasons in Southern China, as described previously.
2.3. RNA Extraction, PCR Screening and Sequencing
Viral RNA was extracted from 200 μL of fecal swab or pellet samples using the High Pure Viral RNA Kit (Roche Diagnostics GmbH, Mannheim, Germany) as per the manufacturer’s instructions. RNA was eluted in 50 μL of elution buffer, aliquoted, and stored at –80 °C. One-step hemi-nested reverse-transcription (RT-) PCR (Invitrogen, San Diego, CA, USA) was employed to detect coronavirus, as previously described.
To confirm the bat species of an individual sample, we PCR amplified the cytochrome b (Cytob) and/or NADH dehydrogenase subunit 1 (ND1) gene using DNA extracted from the feces or swabs. The gene sequences were assembled excluding the primer sequences. BLASTN was used to identify host species based on the most closely related sequences with the highest query coverage and a minimum identity of 95%.
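The host-assignment rule described above (keep hits with at least 95% identity, then take the one with the highest query coverage) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Hit` record, the function name `assign_host`, the tie-break on identity, and the example values are all hypothetical.

```python
from typing import List, NamedTuple, Optional


class Hit(NamedTuple):
    """One BLASTN hit (hypothetical record; fields mirror the criteria in the text)."""
    species: str           # species annotation of the subject sequence
    identity: float        # percent identity of the alignment
    query_coverage: float  # percent of the query covered by the alignment


def assign_host(hits: List[Hit], min_identity: float = 95.0) -> Optional[str]:
    """Return the species of the best hit, or None if no hit passes the cutoff."""
    passing = [h for h in hits if h.identity >= min_identity]
    if not passing:
        return None
    # Highest query coverage wins; breaking ties by identity is an assumption.
    best = max(passing, key=lambda h: (h.query_coverage, h.identity))
    return best.species


# Illustrative values only, not data from this study.
example_hits = [
    Hit("Rhinolophus sinicus", 98.7, 100.0),
    Hit("Rhinolophus affinis", 96.1, 99.0),
    Hit("Rhinolophus pusillus", 93.0, 100.0),
]
print(assign_host(example_hits))
```

A sample whose hits all fall below the identity cutoff returns `None`, which corresponds to leaving the host species unassigned.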
2.4. Sequencing of Full-Length Genomes
Full genomic sequences were determined either by one-step PCR (Invitrogen, San Diego, CA, USA) amplification with degenerate primers (Table S1), designed on the basis of multiple alignments of available alpha-CoV sequences deposited in GenBank, or by amplification with SuperScript IV Reverse Transcriptase (Invitrogen) and the Expand Long Template PCR System (Roche Diagnostics GmbH, Mannheim, Germany) with specific primers (primer sequences are available upon request). Sequences of the 5’ and 3’ genomic ends were obtained by 5’ and 3’ rapid amplification of cDNA ends (SMARTer RACE 5’/3’ Kit; Clontech, Mountain View, CA, USA), respectively. PCR products were gel-purified and subjected directly to sequencing. PCR products over 5 kb were subjected to deep sequencing using a HiSeq 2500 system. For some fragments, the PCR products were cloned into the pGEM-T Easy Vector (Promega, Madison, WI, USA) for sequencing. At least five independent clones were sequenced to obtain a consensus sequence.
2.5. Genome Analysis
The next-generation sequencing (NGS) data were filtered and mapped to the reference sequence of BatCoV HKU10 (GenBank accession number NC_018871) using Geneious 7.1.8. Genomes were preliminarily assembled using DNAStar Lasergene V7 (DNAStar, Madison, WI, USA). Putative open reading frames (ORFs) were predicted using NCBI’s ORF finder (https://www.ncbi.nlm.nih.gov/orffinder/) with a minimal ORF length of 150 nt, followed by manual inspection. The sequences of the 5’ untranslated region (5’-UTR) and 3’-UTR were defined, and the leader sequence and the leader and body transcriptional regulatory sequences (TRSs) were identified as previously described. The cleavage sites of the 16 nonstructural proteins encoded by ORF1ab were determined by alignment with aa sequences of other CoVs and from the recognition patterns of the 3C-like proteinase and papain-like proteinase. Phylogenetic trees based on nt or aa sequences were constructed using the maximum likelihood algorithm with bootstrap values determined by 1000 replicates in the MEGA 6 software package. Full-length genome sequences obtained in this study were aligned with those of previously reported alpha-CoVs using MUSCLE. The aligned sequences were scanned for recombination events using the Recombination Detection Program. Potential recombination events suggested by strong p-values (<10^-20) were confirmed using similarity plot and bootscan analyses implemented in Simplot 3.5.1. The number of synonymous substitutions per synonymous site, Ks, and the number of nonsynonymous substitutions per nonsynonymous site, Ka, for each coding region were calculated using the Ka/Ks calculation tool of the Norwegian Bioinformatics Platform (http://services.cbu.uib.no/tools/kaks) with default parameters. Protein homology was analyzed using HHpred (https://toolkit.tuebingen.mpg.de/#/tools/hhpred) with default parameters.
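To illustrate the ORF prediction step with the 150 nt minimum length, the sketch below scans the three forward reading frames of a sequence for ATG-to-stop ORFs. It is a simplified stand-in and not the NCBI ORF finder algorithm; the function name `find_orfs`, the restriction to the forward strand and to ATG start codons, and the toy sequence are assumptions made for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_len=150):
    """Return (start, end, frame) for forward-strand ORFs of at least min_len nt.

    Coordinates are 0-based and end-exclusive; the length includes the stop codon.
    Only the first ATG after the previous stop in each frame opens an ORF.
    """
    seq = seq.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start >= min_len:
                    orfs.append((start, i + 3, frame))
                start = None
    return sorted(orfs)


# Toy sequence (hypothetical): a 186-nt ORF preceded and followed by 2 nt.
toy = "CC" + "ATG" + "GCT" * 60 + "TAA" + "GG"
print(find_orfs(toy))
```

Predicted ORFs from a scan like this would still need the manual inspection described in the text, for example to check for a preceding TRS.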
2.6. Transcriptional Analysis of Subgenomic mRNA
A set of nested RT-PCRs was employed to determine the presence of viral subgenomic mRNAs in the CoV-positive samples. Forward primers were designed targeting the leader sequence at the 5’-end of the complete genome, while reverse primers were designed within the ORFs. Specific and suspected amplicons of expected sizes were purified and then cloned into the pGEM-T Easy vector for sequencing.
2.7. Cell Lines, Gene Cloning, and Expression
Bat primary or immortalized cells (Rhinolophus sinicus kidney immortalized cells, RsKT; Rhinolophus sinicus lung primary cells, RsLu4323; Rhinolophus sinicus brain immortalized cells, RsBrT; Rhinolophus affinis kidney primary cells, RaK4324; Rousettus leschenaultii kidney immortalized cells, RlKT; Hipposideros pratti lung immortalized cells, HpLuT) generated in our laboratory were all cultured in DMEM/F12 with 15% FBS. Pteropus alecto kidney cells (Paki) were maintained in DMEM/F12 supplemented with 10% FBS. Other cells were maintained according to the recommendations of the American Type Culture Collection (ATCC, www.atcc.org).
The putative accessory genes of the newly detected virus were generated by RT-PCR from viral RNA extracted from fecal samples, as described previously. The influenza virus NS1 plasmid was generated in our lab. The human bocavirus (HBoV) VP2 plasmid was kindly provided by Prof. Hanzhong Wang of the Wuhan Institute of Virology, Chinese Academy of Sciences. SARS-CoV ORF7a was synthesized by Sangon Biotech. Transfections were performed with Lipofectamine 3000 Reagent (Life Technologies). Expression of these accessory genes was analyzed by Western blotting using an mAb (Roche Diagnostics GmbH, Mannheim, Germany) against the HA tag.
2.8. Virus Isolation
Virus isolation was performed as previously described. Briefly, fecal supernatant was acquired via gradient centrifugation, diluted 1:10 in DMEM, and added to Vero E6 cells. After incubation at 37 °C for 1 h, the inoculum was replaced with fresh DMEM containing 2% FBS and antibiotic-antimycotic (Gibco, Grand Island, NY, USA). Three blind passages were carried out. Cells were checked daily for cytopathic effect. Both culture supernatant and cell pellet were examined for CoV by RT-PCR.
2.9. Apoptosis Analysis
Apoptosis was analyzed as previously described. Briefly, 293T cells in 12-well plates were transfected with 3 μg of expression plasmid or empty vector, and the cells were collected 24 h post transfection. Apoptosis was detected by flow cytometry using the Annexin V-FITC/PI Apoptosis Detection Kit (YEASEN, Shanghai, China) following the manufacturer’s instructions. Annexin V-positive and PI-negative cells were considered to be in the early apoptotic phase, and those stained for both Annexin V and PI were deemed to be undergoing late apoptosis or necrosis. All experiments were repeated three times. Student’s t-test was used to evaluate the data, with p < 0.05 considered significant.
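The staining rule above can be written as a small classifier. The sketch below assumes that each cell has already been gated into Annexin V-positive/negative and PI-positive/negative; the gating thresholds, the function names, and the "other" category for Annexin V-negative cells are assumptions, since the text does not assign those cells to a stage.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def classify_cell(annexin_v_pos: bool, pi_pos: bool) -> str:
    """Assign an apoptosis stage from Annexin V and PI staining status."""
    if annexin_v_pos and not pi_pos:
        return "early apoptotic"
    if annexin_v_pos and pi_pos:
        return "late apoptotic or necrotic"
    return "other"  # Annexin V-negative cells are not assigned a stage in the text


def stage_fractions(cells: Iterable[Tuple[bool, bool]]) -> Dict[str, float]:
    """Return the fraction of cells in each stage."""
    counts = Counter(classify_cell(a, p) for a, p in cells)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no cells supplied")
    return {stage: n / total for stage, n in counts.items()}


# Illustrative events only: (Annexin V positive, PI positive).
events = [(True, False), (True, True), (False, False), (False, False)]
print(stage_fractions(events))
```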
2.10. Dual Luciferase Reporter Assays
HEK 293T cells were seeded in 24-well plates and then co-transfected with reporter plasmids (pRL-TK and pIFNβ-Luc or pNF-κB-Luc), as well as plasmids expressing accessory genes, empty vector plasmid pcAGGS, influenza virus NS1, SARS-CoV ORF7a, or HBoV VP2. At 24 h post transfection, cells were treated with Sendai virus (SeV) (100 hemagglutinin units [HAU]/mL) or human tumor necrosis factor alpha (TNF-α; R&D Systems) for 6 h to activate IFN-β or NF-κB, respectively. Cell lysates were prepared, and luciferase activity was measured using the dual-luciferase assay kit (Promega, Madison, WI, USA) according to the manufacturer’s instructions.
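The text does not state how the luciferase readings were normalized. A common convention in dual-luciferase assays, assumed here, is to divide the firefly signal from the promoter reporter by the Renilla signal from the co-transfected pRL-TK control and then express each sample relative to the empty-vector control. The function names and readings below are hypothetical.

```python
def relative_luciferase(firefly: float, renilla: float) -> float:
    """Normalize the firefly reporter signal to the Renilla internal control."""
    if renilla <= 0:
        raise ValueError("Renilla reading must be positive")
    return firefly / renilla


def fold_change(sample_ratio: float, control_ratio: float) -> float:
    """Express a normalized sample relative to a normalized control."""
    if control_ratio <= 0:
        raise ValueError("control ratio must be positive")
    return sample_ratio / control_ratio


# Illustrative readings only, not data from this study.
vector = relative_luciferase(firefly=600.0, renilla=400.0)   # empty vector
sample = relative_luciferase(firefly=1200.0, renilla=400.0)  # accessory gene
print(fold_change(sample, vector))
```

Normalizing to the Renilla control is intended to correct for well-to-well differences in transfection efficiency and cell number.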
2.11. BtCoV/Rh/YN2012 Spike-Mediated Pseudoviruses Cell Tropism Screening
Retroviruses pseudotyped with BtCoV/Rh/YN2012 RsYN1, RsYN3, RaGD, or MERS-CoV spike, or no spike (mock), were used to infect human, bat, or other mammalian cells in 96-well plates. The pseudovirus particles were confirmed by Western blotting and negative-staining electron microscopy. Pseudovirus production and the measurements of infection and luciferase activity were conducted as described previously.
2.12. Nucleotide Sequence Accession Numbers
The complete genome nucleotide sequences of BtCoV/Rh/YN2012 strains RsYN1, RsYN2, RsYN3, and RaGD obtained in this study have been submitted to GenBank under accession numbers MG916901 to MG916904.
3.1. CoVs Detected in Rhinolophus Bats
The surveillance was performed between November 2004 and November 2014 in 19 provinces of China. In total, 2061 fecal samples were collected from at least 12 Rhinolophus bat species (Figure 1A). CoVs were detected in 209 of these samples (Figure 1B and Table 1). Partial RdRp sequences suggested the presence of at least 8 different CoVs. Five of these viruses are related to known species: Mi-BatCoV 1 (>94% nt identity), Mi-BatCoV HKU8 (>93% nt identity), BtRf-AlphaCoV/HuB2013 (>99% nt identity), SARSr-CoV (>89% nt identity), and HKU2-related CoV (>85% nt identity). The other three CoV sequences showed less than 83% nt identity to known CoV species and should therefore represent novel CoV species. Virus isolation was performed as previously described but was not successful.
3.2. Genomic Characterization of a Novel Alpha-CoV (BtCoV/Rh/YN2012)
We next characterized a novel alpha-CoV, BtCoV/Rh/YN2012. It was detected in three R. affinis and six R. sinicus bats. Based on the sequences, we defined three genotypes, which are represented by RsYN1, RsYN3, and RaGD, respectively. Strain RsYN2 was classified into the RsYN3 genotype. Four full-length genomes were obtained. Three of them were from R. sinicus (strains RsYN1, RsYN2, and RsYN3), while the other one was from R. affinis (strain RaGD). The sizes of these four genomes range from 28,715 to 29,102 nt, with G+C contents between 39.0% and 41.3%. The genomes exhibit similar structures and transcription regulatory sequences (TRSs) that are identical to those of other alpha-CoVs (Figure 2 and Table 2). The exceptions are three additional ORFs (ORF3b, ORF4a, and ORF4b). All four strains have ORF4a and ORF4b, whereas only strain RsYN1 has ORF3b.
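For reference, the G+C content reported above is the percentage of G and C among the unambiguous bases of a genome. A minimal sketch follows; the function name and the choice to ignore ambiguous symbols such as N are assumptions, and the toy sequences are not from this study.

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage, counting only unambiguous bases (A, C, G, T/U)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGTU")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


print(gc_content("ATGC"))        # two of four bases are G or C
print(gc_content("GGCCNNAT"))    # the two N symbols are ignored
```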
The replicase gene, ORF1ab, occupies ~20.4 kb of the genome. It encodes polyproteins 1a and 1ab, which could be cleaved into 16 non-structural proteins (Nsp1–Nsp16). The 3’-ends of the cleavage sites recognized by 3C-like proteinase (Nsp4–Nsp10, Nsp12–Nsp16) and papain-like proteinase (Nsp1–Nsp3) were confirmed. These proteins include Nsp3 (papain-like 2 protease, PL2pro), Nsp5 (chymotrypsin-like protease, 3CLpro), Nsp12 (RdRp), Nsp13 (helicase), and other proteins of unknown function (Table 3). The 7 concatenated domains of polyprotein 1 shared <90% aa sequence identity with those of other known alpha-CoVs (Table 2), suggesting that these viruses represent a novel CoV species within the alpha-CoVs. The closest assigned CoV species to BtCoV/Rh/YN2012 are BtCoV-HKU10 and BtRf-AlphaCoV/HuB2013. The three strains from Yunnan Province were clustered into two genotypes (83% genome identity) that correlated with their sampling locations. The third genotype, represented by strain RaGD, was distantly related to the strains found in Yunnan (<75.4% genome identity).
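The species-level comparison above rests on pairwise aa identity across the concatenated replicase domains, with <90% identity taken to indicate a distinct species. The sketch below computes percent identity from two pre-aligned sequences and applies that threshold. The function names and the gap handling (columns gapped in both sequences are skipped; a gap against a residue counts as a mismatch) are assumptions, and alignment tools may report identity differently.

```python
def percent_identity(aln_a: str, aln_b: str) -> float:
    """Percent identity between two aligned sequences of equal length."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aln_a.upper(), aln_b.upper()):
        if a == "-" and b == "-":
            continue  # column gapped in both sequences carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


def is_distinct_species(identity_pct: float, threshold: float = 90.0) -> bool:
    """Apply the <90% aa identity criterion used in the text."""
    return identity_pct < threshold


# Toy alignment (hypothetical), 5 comparable columns with 4 matches.
identity = percent_identity("MKVL-A", "MKIL-A")
print(identity, is_distinct_species(identity))
```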
We then examined the individual genes (Table 2). All of the genes showed low aa sequence identity to known CoVs. The four strains of BtCoV/Rh/YN2012 showed genetic diversity in all genes except ORF1ab (>83.7% aa identity). Notably, the spike proteins are highly divergent among these strains. The other structural proteins (E, M, and N) are more conserved than the spike and accessory proteins. Comparing the accessory genes among these four strains revealed that strains of the same genotype shared a 100% identical ORF3a. However, the proteins encoded by ORF3a were highly divergent among different genotypes (<65% aa identity). The putative accessory genes were also BLASTed against GenBank records. Most accessory genes have no homologues in the GenBank database, except for ORF3a (52.0–55.5% aa identity with BatCoV HKU10 ORF3) and ORF9 (28.1–32.0% aa identity with SARSr-CoV ORF7a). We analyzed protein homology with HHpred. The results showed that the ORF9 proteins and SARS-CoV ORF7a are homologues (probability: 100%, E value <10^-48). We further screened the genomes for evidence of recombination. No significant recombination breakpoint was detected by bootscan analysis.
3.3. Subgenomic Structures and Accessory Genes of BtCoV/Rh/YN2012
To confirm the presence of subgenomic RNA, we designed a set of primers targeting all the predicted ORFs as described. The amplicons were first confirmed by agarose-gel electrophoresis and then by sequencing (Figure 3 and Table 2). The sequences showed that all the ORFs, except ORF4b, had a preceding TRS. Hence, ORF4b may be translated from bicistronic mRNAs. In RsYN1, an additional subgenomic RNA starting inside ORF3a was found through sequencing, which gives rise to a unique ORF3b.
3.4. Phylogenetic Analysis
Phylogenetic trees were constructed using the aa sequences of RdRp and S of BtCoV/Rh/YN2012 and other representative CoVs (Figure 4). In both trees, all BtCoV/Rh/YN2012 strains clustered together and formed a lineage distinct from other known coronavirus species. Two distinct sublineages were observed within BtCoV/Rh/YN2012. One was from R. affinis sampled in Guangdong, while the other was from R. sinicus sampled in Yunnan. Among the strains from Yunnan, RsYN2 and RsYN3 clustered together, while RsYN1 formed a separate branch. The topology of these four strains correlated with the sampling location. The relatively long branches reflect high diversity among these strains, indicating a long independent evolutionary history.
3.5. Estimation of Synonymous and Nonsynonymous Substitution Rates
The Ka/Ks ratios (Ks is the number of synonymous substitutions per synonymous site and Ka is the number of nonsynonymous substitutions per nonsynonymous site) were calculated for all genes. The Ka/Ks ratios for most of the genes were generally low, which indicates that these genes were under purifying selection. However, the Ka/Ks ratios of ORF4a, ORF4b, and ORF9 (0.727, 0.623, and 0.843, respectively) were significantly higher than those of other ORFs (Table 4). To further evaluate the selection pressure on ORF4a and ORF4b, we sequenced another four ORF4a and ORF4b genes (strains Rs4223, Rs4236, Rs4240, and Ra13576, shown in Figure 1B). The Ka/Ks ratios of the genes detected in 2012 (two strains) and 2013 (six strains) were calculated separately. A reduction of Ka/Ks was observed from 2012 to 2013 (4a: 1.135 to 0.487; 4b: 4.489 to 1.764).
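As a reading aid for these ratios, the sketch below applies the conventional interpretation of Ka/Ks: values below 1 are consistent with purifying selection and values above 1 with positive selection. The function names are hypothetical, the rule ignores statistical uncertainty in the estimates, and only the ratios quoted in the text are used as inputs.

```python
def ka_ks_ratio(ka: float, ks: float) -> float:
    """Ratio of nonsynonymous to synonymous substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    return ka / ks


def selection_regime(ratio: float) -> str:
    """Conventional reading of a Ka/Ks ratio (no significance test)."""
    if ratio < 1.0:
        return "purifying"
    if ratio > 1.0:
        return "positive"
    return "neutral"


# Ka/Ks values quoted in the text for ORF4a and ORF4b in 2012 and 2013.
reported = {
    ("ORF4a", 2012): 1.135,
    ("ORF4a", 2013): 0.487,
    ("ORF4b", 2012): 4.489,
    ("ORF4b", 2013): 1.764,
}
for (orf, year), ratio in reported.items():
    print(orf, year, ratio, selection_regime(ratio))
```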
3.6. Apoptosis Analysis of ORF9
As SARS-CoV ORF7a was reported to induce apoptosis, we conducted apoptosis analysis on BtCoV/Rh/YN2012 ORF9, a homologue of SARSr-CoV ORF7a with ~30% aa identity. We transiently transfected ORF9 of BtCoV/Rh/YN2012 into HEK293T cells to examine whether this ORF9 triggers apoptosis. Western blotting was performed to confirm the expression of the ORF9 proteins and SARS-CoV ORF7a (Figure S1). Unlike ORF7a of SARS-CoV Tor2, ORF9 did not induce apoptosis (Figure S2). The results indicated that BtCoV/Rh/YN2012 ORF9 was not involved in apoptosis induction.
3.7. ORF4a Proteins Induce Production of IFN-β
To determine whether these accessory proteins modulate IFN induction, we transfected reporter plasmids (pIFNβ-Luc and pRL-TK) and expression plasmids into 293T cells. All cells over-expressing the accessory genes, influenza virus NS1 (strain PR8), HBoV VP2, or empty vector were tested for luciferase activity after SeV infection. As expected, luciferase activity stimulated by SeV was remarkably higher than that without SeV treatment. Influenza virus NS1 inhibits expression from the IFN promoter, while HBoV VP2 activates it. Compared to those controls, the ORF4a proteins exhibited an activating effect similar to that of HBoV VP2 (Figure 5A). Other accessory proteins showed no effect on IFN production (Figure S3). Expression of these accessory genes was confirmed by Western blot (Figure S1).
3.8. ORF3a Proteins Modulate NF-κB
NF-κB plays an important role in regulating the immune response to viral infection and is also a key factor frequently targeted by viruses for taking over the host cell. In this study, we tested whether these accessory proteins could modulate NF-κB. 293T cells were co-transfected with reporter plasmids (pNF-κB-Luc and pRL-TK), as well as accessory protein-expressing plasmids or controls (empty vector, NS1, SARS-CoV Tor2-ORF7a). The cells were mock treated or treated with TNF-α for 6 h at 24 h post-transfection, and the luciferase activity was determined. RsYN1-ORF3a and RaGD-ORF3a activated NF-κB, as did SARS-CoV ORF7a, whereas RsYN2-ORF3a inhibited NF-κB, as did NS1 (Figure 5B). Expression of the ORF3a proteins was confirmed by Western blot (Figure S1). Other accessory proteins did not modulate NF-κB (Figure S4).
3.9. BtCoV/Rh/YN2012 Spike Mediated Pseudovirus Entry
To understand the infectivity of these newly detected BtCoV/Rh/YN2012 strains, we selected the RsYN1, RsYN3, and RaGD spike proteins for spike-mediated pseudovirus entry studies. Both Western blot analysis and negative-staining electron microscopy confirmed the successful preparation of BtCoV/Rh/YN2012 pseudoviruses (Figure S5). A total of 11 human cell lines, 8 bat cell lines, and 9 other mammalian cell lines were tested, and no strong positive was found (Table S2).
In this study, a novel alpha-CoV species, BtCoV/Rh/YN2012, was identified in two Rhinolophus species. Full-length genomes of four strains were sequenced. The 7 conserved replicase domains of these viruses possessed <90% aa sequence identity to those of other known alpha-CoVs, which defines a new species in accordance with the ICTV taxonomy standard. These novel alpha-CoVs showed high genetic diversity in their structural and non-structural genes. Strain RaGD from R. affinis, collected in Guangdong Province, formed a divergent independent branch from the other 3 strains from R. sinicus, sampled in Yunnan Province, indicating an independent evolution process associated with geographic isolation and host restriction. Though collected from the same province, these three virus strains formed two genotypes that correlated with sampling locations. These two genotypes had low genome sequence identity, especially in the S gene and accessory genes. Considering the remote geographic location of the host bat habitat, the host tropism, and the virus diversity, we suppose that BtCoV/Rh/YN2012 may have spread in these two provinces with a long history of circulation in their natural reservoir, Rhinolophus bats. Based on the sequence evidence, we suppose that these viruses are still rapidly evolving.
Our study revealed that BtCoV/Rh/YN2012 has a unique genome structure compared to other alpha-CoVs. First, novel accessory genes with no known homologues were identified in the genomes. Second, multiple TRSs were found between the S and E genes, whereas other alphacoronaviruses have only one TRS there. These TRSs precede ORF3a, ORF3b (only in RsYN1), and ORF4a/b, respectively. Third, the accessory gene ORF9 showed homology with genes of known CoV species in another coronavirus genus, especially with accessory genes from SARSr-CoV.
Accessory genes are usually involved in virus-host interactions during CoV infection. In most CoVs, accessory genes are dispensable for virus replication. However, an intact 3c gene of feline CoV was required for viral replication in the gut. Deletion of the genus-specific genes in mouse hepatitis virus led to a reduction in virulence. SARS-CoV ORF7a has been reported to be involved in the suppression of RNA silencing, inhibition of cellular protein synthesis, cell-cycle blockage, and apoptosis induction. In this study, we found that BtCoV/Rh/YN2012 ORF9 shares ~30% aa sequence identity with SARS-CoV ORF7a. Interestingly, BtCoV/Rh/YN2012 and SARSr-CoV were both detected in R. sinicus from the same cave. We suppose that SARS-CoV and BtCoV/Rh/YN2012 may have acquired ORF7a or ORF9 from a common ancestor through genome recombination or horizontal gene transfer. However, ORF9 of BtCoV/Rh/YN2012 failed to induce apoptosis or activate NF-κB; these differences may result from the divergent evolution of these proteins under different selection pressures.
Although the ORF4a proteins of different BtCoV/Rh/YN2012 strains share <64.4% amino acid identity, all of them could activate IFN-β. ORF3a from RsYN1 and RaGD upregulated NF-κB, but the homologue from RsYN2 downregulated NF-κB expression. These differences may be caused by amino acid sequence variations and may contribute to viral pathogenicity through different pathways.
Although intestinal cell lines from the natural host of BtCoV/Rh/YN2012 were not available, we screened the cell tropism of the spike proteins through pseudotyped retrovirus entry with human, bat, and other mammalian cell lines. Most of the cell lines screened were not susceptible to BtCoV/Rh/YN2012, indicating a low risk of interspecies transmission to humans and other animals. Multiple factors may lead to failed infection in the coronavirus spike-pseudotyped retrovirus system, including absence of the receptor in target cells, failure to recognize the receptor homologue from non-host species, maladaptation in non-host cells during spike maturation or virus entry, or limitations of the retrovirus system in simulating coronavirus entry. The weak infectivity of RsYN1 pseudotyped retrovirus in Huh-7 cells could be explained by the binding of spike protein to polysaccharide secreted to the cell surface. This assumption needs to be confirmed by further experiments.
Our long-term surveillance suggests that Rhinolophus bats harbor a wide diversity of CoVs. Coincidentally, the two highly pathogenic agents, SARS-CoV and Rh-BatCoV HKU2, both originated from Rhinolophus bats. Considering the diversity of CoVs carried by this bat genus and their wide geographical distribution, there may be a low risk of spillover of these viruses to other animals and humans. Long-term surveillance and pathogenesis studies will help to prevent future human and animal diseases caused by these bat CoVs.