Dataset: 11.1K articles from the COVID-19 Open Research Dataset (PMC Open Access subset)
All articles are made available under a Creative Commons or similar license. Specific licensing information for individual articles can be found in the PMC source and CORD-19 metadata.

Made by DATEXIS (Data Science and Text-based Information Systems) at Beuth University of Applied Sciences Berlin

Deep Learning Technology: Sebastian Arnold, Betty van Aken, Paul Grundmann, Felix A. Gers and Alexander Löser. Learning Contextualized Document Representations for Healthcare Answer Retrieval. The Web Conference 2020 (WWW'20)

Funded by The Federal Ministry for Economic Affairs and Energy; Grant: 01MD19013D, Smart-MD Project, Digital Technologies


Re-emergent Human Adenovirus Genome Type 7d Caused an Acute Respiratory Disease Outbreak in Southern China After a Twenty-one Year Absence

Adenovirus identification and genome annotation

All 11 specimens collected were amplified “PCR-positive” for adenovirus and identified as HAdV-7 by type-specific PCR analysis. Of these, two, isolated from hospitalized and presumably more severe cases, produced visible CPE upon culturing. They were archived as DG01_2011 and DG02_2011. Sequence analysis revealed identical hexons, which were identified as HAdV type 7 by BLASTn analysis. The genome from DG01_2011 was then sequenced, assembled, annotated, and analyzed. Figure 1 presents the genomic organization and transcription map of DG01_2011. This genome contains 35,240 bp with a GC content of 51.08%. A total of 39 coding sequences were identified. These genome data, noted formally as “human adenovirus 7 strain CHN/DG01/2011/7[P7H7F7]” and in this report as “DG01_2011”, were deposited in GenBank (accession number KC440171).
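The genome length and GC content quoted above are simple summary statistics of the assembled sequence. A minimal sketch of how such a value can be computed is given below; the function name `gc_content` and the toy sequence are illustrative assumptions and are not taken from the deposited record KC440171.

```python
def gc_content(sequence: str) -> float:
    """Return the GC percentage of a nucleotide sequence.

    Ambiguous bases (anything other than A, C, G, T) are ignored.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


toy = "ATGCGCGTTA"  # toy sequence for illustration only
print(len(toy), gc_content(toy))  # 10 50.0
```

Applied to a full genome read from a FASTA file, the same function would return the percentage reported alongside the genome length.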

Genome type determination of HAdV-7 strains DG01_2011, 0901HZ/ShX/2009, and CQ1198_2010

The genome type of DG01_2011 was determined by comparing its in silico REA profiles with other HAdV-7 genome types reported in the literature37111520. Although seemingly antiquated in comparison to genome sequencing, REA profiles are still useful for comparisons with unsequenced but previously reported genome types and strains, and also as rapid and less-expensive alternatives for large-scale characterizations of viruses given a correct reference strain. Using the genome type denomination of Li, et al.20, DG01_2011 is identified as HAdV-7d, evidenced by the REA patterns and identical with the first reported HAdV-7d25, as shown in Figure 2. The REA patterns generated from DG01_2011, 0901HZ/ShX/2009, which caused acute bronchitis and pneumonia in an ARD outbreak comprising 70 cases amongst young children in the Shaanxi Province in 2009, including one fatality24, and CQ1198_2010, which was associated with an epidemic in Chongqing, Southwestern China (Ni, K., et al., unpublished; JX625134), are identical to each other and also identical with those of HAdV-7d reported earlier in Israel (1998) and Japan (1999)2627. The REA patterns of CQ1198_2010 in this study provide evidence to amend the less-descriptive designation of “mutant HAdV-7d2” noted by Ni, K., et al. in the GenBank entry for CQ1198_2010 (JX625134). For reference, the in silico REA profile for the prototype Gomen HAdV-7 is provided; the REA patterns for these recent isolates differ clearly from the prototype, as shown in Figure 2. HAdV-7 prototype is the correct reference genome as three REA profiles, BclI, SalI, and XhoI, showed identical patterns and complement the REA patterns that differ, along with sequence similarities across the genome.

Phylogenetic analysis of hexon genes and whole genomes confirms the genome types

Phylogenetic analysis of 33 archived HAdV-7 hexon genes showed that DG01_2011 has an origin common to strains 0901HZ/ShX/2009, CQ1198_2010, Hebei_SJZ_2011, and TW_2011. These hexons form a subclade that is on the same branch with another subclade containing several non-China isolates, including HAdV-7d2 from the U.S., as shown in Figure 3A. The bootstrap value of 65 indicates the hexons from the China genomes are highly similar to each other but are separate from the U.S. HAdV-7d2 subclade (bootstrap value 82). Furthermore, the phylogenetic analysis of 16 available HAdV-7 whole genomes showed that DG01_2011, 0901HZ/ShX/2009, and CQ1198_2010 form a subclade comprising HAdV-7d, confirming their close relationship and reaffirming a common lineage (Figure 3B) that is distinct from the HAdV-7d2 strains of the U.S.A. (bootstrap value 100). All of the genome types form subclades that are separate from the clade containing the prototype 7 (Gomen; 1952), with HAdV-7h forming a separate subclade in the genome phylogenetic analysis in contrast to the hexon gene phylogenetic analysis.

Comparative genomic analysis and single nucleotide differences of HAdV-7 strains causing ARD outbreaks in China

Comparative genomics analysis showed DG01_2011 has near genome identity with an earlier HAdV-7 isolate, 0901HZ/ShX/2009 (99.97%) and also with CQ1198_2010 (99.96%). The analysis documented seven single nucleotide substitutions and one single base insertion between the DG01_2011 and 0901HZ/ShX/2009 genomes. Of these, two single nucleotide substitutions were localized in the ITRs and one non-synonymous substitution each was located in the DNA polymerase, penton base, and 34 kDa protein coding sequences (Table 1). One synonymous nucleotide substitution each was present in the 100-kDa hexon assembly-associated protein and Virus-Associated (VA) RNA II. The single nucleotide insertion was in a non-coding region of DG01_2011. There were three single nucleotide substitutions in coding sequences and seven base deletion differences in the ITRs between the CQ1198_2010 and 0901HZ/ShX/2009 genomes. One synonymous substitution (C to T) was located in the hexon assembly-associated protein (A254A) and the other two, non-synonymous substitutions G to C and G to T, were located in the DNA polymerase (S55C) and 34 kDa protein (P87Q), respectively. The nucleotide deletions in the ITRs of CQ1198_2010 may be sequencing errors, given that the left ITR was not identical with the right ITR, or may represent recent mutations. Excluding the ITR differences, there were only three single nucleotide substitutions between the CQ1198_2010 and 0901HZ/ShX/2009 genomes (99.99%). Excluding the ITR differences, strain DG01_2011 had a higher genome identity with CQ1198_2010 (99.98%) than with 0901HZ/ShX/2009 (99.97%). There were only four single nucleotide substitutions and one single nucleotide insertion in a non-coding region between these two genomes, which led to three non-synonymous substitutions in the DNA polymerase (D1039E, S55C) and penton base gene (V239A), respectively.
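Counts of substitutions and insertions/deletions, and the resulting percent identity, can be tallied column by column from a pairwise alignment. The sketch below shows one way to do this, assuming two pre-aligned sequences of equal length with "-" marking gaps; the function name `compare_aligned` and the toy input are hypothetical, and the identity definition used here (matches over all non-empty columns) is an assumption that may differ from the one applied by the software used in this study.

```python
def compare_aligned(a: str, b: str) -> dict:
    """Tally matches, substitutions and indel columns in a pairwise alignment."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = substitutions = indels = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue  # empty column, carries no information
        if x == "-" or y == "-":
            indels += 1
        elif x == y:
            matches += 1
        else:
            substitutions += 1
    columns = matches + substitutions + indels
    if columns == 0:
        raise ValueError("alignment has no informative columns")
    return {
        "substitutions": substitutions,
        "indels": indels,
        "identity_percent": 100.0 * matches / columns,
    }


# Toy alignment: one single-base insertion and one substitution.
print(compare_aligned("ATGC-ATGCA", "ATGCTATGGA"))
# {'substitutions': 1, 'indels': 1, 'identity_percent': 80.0}
```

Restricting the input to the region between the ITRs would give the "excluding ITR differences" variant of the comparison.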

Nucleotide substitution rates and selection pressures for HAdV-7d strain DG01_2011 major capsid protein genes

The selective pressures at the protein level for the three HAdV-7 capsid protein genes, hexon, penton base and fiber, were examined by comparing synonymous and non-synonymous mutations. All three genes have Ka/Ks ratios of less than 1 (Table 2). This is in accordance with the hypothesis that organismal evolution is dominated by negative selection, i.e., selection that removes mutations harmful to fitness28. Specifically, both the hexon and penton base genes have fewer non-synonymous substitutions per site, which leads to the low Ka/Ks ratios. Although the non-synonymous substitution rate and Ka/Ks ratio of the fiber gene are also low, they are higher than those of the hexon and penton base genes. This may indicate that the fiber gene is under weaker negative selection pressure, likely due to tissue tropism being determined and constrained by the fiber gene. Overall, the majority of mutations are synonymous and do not affect the integrity of the hexon, penton base, and fiber proteins.

Genome recombination analysis of HAdV-7d

Genome recombination analysis using Simplot software29 reveals a lateral transfer of a small portion of the genome upstream of the penton base gene. This recombination transfers the entire L1 52/55 kDa gene from HAdV-16 into HAdV-7d, as shown in Figure 4A. Its importance remains to be revealed. The gene transfer is also found in the genomes of the earlier strains CQ1198_2010 (Southwestern China; 2010; unpublished) and 0901HZ/ShX/2009 (Northwestern China; 2009)24, as shown in Figure 4B, but is not found in the prototype Gomen HAdV-7 genome, as displayed in Figure 4C.

Discussion

Among the two HAdV species B respiratory pathogens most frequently associated with ARD outbreaks globally, HAdV-7 is reported to cause a higher mortality rate than HAdV-3 in one long-term survey (1958–1990)20, as well as in a recent shorter term survey of adenoviral pneumonia cases in Beijing (2009–2011)19. Genome type HAdV-7d apparently originated and circulated in China from 1958 to 1990, becoming the predominant strain during the period of 1980–199020. It was also the prevalent genome type found in Korea during two outbreaks in 1995–1996 and 2001–2002, accounting for 98–100% of all the type 7 HAdV strains assayed30. Interestingly, despite reports of global circulation, HAdV-7, and in particular HAdV-7d, epidemics had not been reported in mainland China from 1990 to 2009. In 2009, HAdV-7 was identified as the respiratory pathogen in an outbreak that included a fatality in Shaanxi24 and also in a 2010 outbreak in Chongqing (unpublished), signaling a reemergence. Thorough characterization of these pathogens is evidenced by the availability of two genome sequences (JF800905 and JX625134), both of which are further identified as the HAdV-7d genome type in this report, and shown to be nearly identical to this report of an isolate from a 2011 ARD outbreak in Guangdong Province (strain DG01_2011) by comparative genomics and, in particular, in silico REA pattern analysis, as presented in Figure 2.

Although not ideal and largely replaced by whole genome sequencing, REA patterns can still provide rapid and relatively inexpensive characterizations of the genomes of large numbers of pathogens in an outbreak313233343536. For HAdV comparisons, the caveat is to use the correct reference genome; for example, HAdV-55 contains a partial hexon gene from HAdV-11, comprising approximately only 2.6% of the length of the genome, in a chassis of HAdV-14, comprising approximately 97.4% of the length of the genome37. Using the genome of HAdV-11 as a reference yields meaningless patterns that are subject to researcher-biased interpretations and leads to erroneous conclusions that HAdV-55 is a genome type of HAdV-11. Using the HAdV-14 genome as a reference provides a closer approximation of the genome identities438. However, the recombination event revealed by whole genome sequencing, with the conflicting “Trojan Horse” renal pathogen epitope observed with ARD symptoms, indicates this was a novel and emergent pathogen43739404142. In contrast, for HAdV-7d, the prototype HAdV-7 genome provides the correct reference: three REA patterns are identical (BclI, SalI, and XhoI); four are obviously different (BamHI, BglI, HpaI, and SmaI); and five are highly similar with a few differences in the band patterns (BglII, BstEII, HindIII, EcoRI, and XbaI), shown in Figure 2.

The major advantages of REA comparisons are the value and abundance of earlier molecular epidemiology studies, prior to the genome sequencing era, presenting REA data, and, in many cases, relating particular genome types to clinical, epidemiological, and pathogenicity observations. All of these historical strains are physically lost and no longer available for further genomic or laboratory characterization. In essence, however, the value and knowledge of the outbreaks, pathogens, and researchers of the past are not entirely lost if genomes of current pathogenic strains of interest may be compared with published REA patterns of past pathogens, as demonstrated in the genome type identities presented in this report.

Whole genome characterization of HAdV provides a higher-resolution perspective of understanding this pathogen, which may or may not lead to better public health strategies and measures to prevent outbreaks. As noted for two ARD pathogens, HAdV-4 and HAdV-7, “restricted use” but effective vaccines can be and are deployed currently in the U.S. military to prevent ARD outbreaks8910. However, even if there were no viable strategy to manage HAdV outbreaks, knowing the genome type, either by REA or by whole genome sequencing, allows an understanding of the epidemiology, including potential morbidity and mortality profiles, of the circulating pathogens.

As discussed earlier, genome types may have different pathogenicity, infectivity, and virulence profiles; for example, a higher mortality rate was reported for children infected by genome types 7d and 7l in Korea, with mortality rates of 18%, compared to 3.6% for HAdV-3 infections15. Another genome type, HAdV-7h, also resulted in more severe symptoms, including fatalities in South America23. For their molecular epidemiological studies of HAdV-7, Wadell and colleagues presented numerous REA patterns generated with restriction endonucleases (BamHI, BclI, BglI, BglII, BstEII, HindIII, HpaI, SmaI, EcoRI, SalI, XbaI, and XhoI), parsing HAdV-7 isolates from various regions and across many years to divide them into more than 20 genome types1525434445.

Adenoviruses contain relatively stable double-stranded DNA genomes144647. There are seven single base substitutions and a one-base insertion between strains DG01_2011 and 0901HZ/ShX/2009, which led to three non-synonymous substitutions in the DNA polymerase, penton base, and 34 kDa protein coding sequences. Interestingly, there are only three single base substitutions between strains CQ1198_2010 and 0901HZ/ShX/2009, exclusive of the nucleotide deletions in the ITRs of CQ1198_2010, which may be due to sequencing errors. The high genome percent identity between strains CQ1198_2010 and 0901HZ/ShX/2009 and the adjacent locations of Chongqing and Shaanxi (394 kilometers apart), where these strains were isolated, indicate strain 0901HZ/ShX/2009 may be the origin of strain CQ1198_2010. Strain DG01_2011 has a higher genome identity with CQ1198_2010 than with 0901HZ/ShX/2009, which also supports the hypothesis that strain CQ1198_2010 is the ancestor of strain DG01_2011.

Although HAdV genomes appear stable in terms of single base changes, as expected for double stranded DNA viruses and as observed in pairs of HAdV genomes examined to date, e.g., the prototype versus circulating strains of HAdV-3 and -5, separated by approximately fifty years4647, less common but biologically and clinically significant larger genome changes are observed either as a single, small recombination event, such as the lateral transfer of the renal pathogen epsilon epitope (HAdV-11) providing a “Trojan Horse” effect to the recombinant HAdV-55, an emergent acute respiratory disease (ARD) pathogen in a putatively immune naive host population37, or as multiple and larger recombination events, such as the lateral transfer of the non-pathogen epsilon epitope (HAdV-D22) along with multiple other sequences to an emergent recombinant resulting in the highly contagious ocular pathogen causing epidemic keratoconjunctivitis (EKC), HAdV-D5348. Additionally, the presence of the epsilon epitope of a nonpathogenic type, HAdV-19, found in several recently reported emergent recombinant EKC pathogens, HAdV-64, supports the hypothesis that recombination amongst HAdVs is an important mechanism driving the molecular evolution and genesis of HAdV pathogens49. In both of these latter examples, newly emergent HAdV pathogens have the “serotype” of nonpathogens but are potent, significant, and highly contagious human pathogens.

Recombination appears to play another novel and major role in the molecular evolution of HAdVs and genesis of human pathogens. Recent reports of HAdV genomes containing genome segments, including near-entire genomes, derived from simian adenoviruses (SAdVs) indicate zoonosis is an avenue of lateral gene transfer. Thus, nonhuman primates may be a wellspring of emergent human pathogens5051, and vice versa52.

A novel third type of lateral gene transfer is revealed in this newly reported genome of HAdV-7d strain DG01_2011, that of a “moderate-sized” single whole gene recombination. This serendipitous insight into the molecular evolution of these respiratory pathogens from HAdV species B demonstrates that the genomes of individual HAdV types, such as type 7, contain changes that are revealed only by high-resolution genome sequences and that may be important in the context of HAdV molecular evolution, viral fitness, and the origins and bases of clinical and pathogenicity differences, and may account for emergent and re-emergent pathogens. BLAST analysis reveals the recombinant region to encode the entire L1 52/55 kDa gene of HAdV-B16 with flanking non-coding sequences. The BLAST scores indicate the first highly similar sequence, aside from several type 7 sequences, is that from the HAdV-B16 prototype (Max. score 2710, Total score 2710, Query cover 100%, E-value 0.0, Ident. 96%) and a HAdV-B16 recombinant (2687, 2687, 100%, 0.0, 96%), with additional homologous and highly similar sequences found in HAdV-B50 (2436, 2436, 100%, 0.0, 93%) and HAdV-B21 strains (2422, 2422, 100%, 0.0, 93%). This gene encodes a DNA-binding protein that is expressed in both the early and late stages of infection, suggesting it could play multiple roles in the adenoviral life cycle. The L1 52/55 kDa protein interacts with the IVa2 protein and is an essential protein that is absolutely required for DNA packaging as well5354. Effects of this particular moderate-sized recombination from HAdV-B16 into the HAdV-B7 genome chassis and the resultant emergent pathogen are unknown pending wet-bench investigations and additional clinical reports. The HAdV-7 prototype strain (AY594255)55 analyzed is also known as the Gomen strain, which was isolated as a clinical specimen from a throat washing of a U.S. military recruit with pharyngitis56. This strain is nearly contemporaneous with the Greider strain (AY594256)57, aka HAdV-7a, which was used to develop the vaccine strain1458. Although there are minor genome differences, e.g., point mutations, between the prototype and the vaccine strains, pairwise genome dot blot analysis (PipMaker) indicated no recombination events14.

These observations strongly support and validate the recent paradigm change of using the genome data along with biological and clinical profile changes to recognize, characterize, type, and name novel HAdVs rather than relying solely on the epsilon and/or gamma epitopes459, determined either by serology or imputed by limited DNA sequencing, in the past.

With the exception of sporadic HAdV-7 infections reported in children in Guangzhou (2011)60 and the three recent outbreaks, the apparent absence of type 7 ARD pathogen circulating in the population of Southern China before 2011 leads to a concern that the dense city populations in China are now immunologically naïve with respect to HAdV-7. In Northern China, recently, 312 isolates were typed as HAdV-7 by PCR and sequencing of hexon genes from 848 HAdV-positive specimens during 2003–2012; HAdV-7 was associated with most of the severe lower respiratory HAdV infections18. Coupled with increased opportunities for travel, a “Perfect Storm” for present and near-future outbreaks of the apparently more severe disease-causing HAdV-7d strains is foreboding.

In Chongqing (Southwestern China), 92 (48.17%) cases involving HAdV-7 were identified from children presenting with ARD during 2009–2012 by hexon sequencing40. Recently, there were two ARD outbreaks caused by HAdV-7, both of which occurred in military training camps, one in Shaanxi Province (Northwest China) from February to March of 201261, the other in Wuhan (Central China) from January to February of 201362. In the former outbreak, a total of 176 patients were sampled, with all of the patients being males, with ages between 16–34 years61. In the latter, 440 patients aged between 17–22 years were reported as afflicted with ARD62. In Taiwan, there was a large community outbreak of HAdV-7 in 201116. In this instance, an abrupt increase in percentage of HAdV-7 infections occurred, from 0.3% in 2008–2010 to 10% in 201116. The hexon nucleotide sequences of five HAdV-7 isolates collected in Taiwan were identical to the sequence of HAdV-7 strain 0901HZ/ShX/200916, which was also identical to DG01_2011. In the context of the data in this report, these “Taiwan” hexon genes formed the same subclade with strains 0901HZ/ShX/2009, CQ1198_2010 and DG01_2011 (Fig. 3A). Given that only the hexon genes were sequenced, the exact genome types of these strains in the two outbreaks remain unknown. However, the possibility of a HAdV-7d genome type circulating is foreboding. Further data, including complete genome sequencing and in silico REA, are important to confirm this possibility.

In the interest of global public health, with these recent outbreaks and the identification of nearly identical contemporary HAdV-7d genome types, we strongly urge molecular surveillance and genotyping of newly isolated HAdV strains in China by whole genome sequencing and/or in silico REA. Additionally, the newly-redeveloped vaccines, which are now only accessible to the U.S. military9, should be made available to the civilian “at-risk” public to prevent “preventable” highly contagious outbreaks involving HAdVs associated with high morbidity rates and fatalities5715161718192021222324252627304345606163. In particular, the vaccine against HAdV-7 is urgently needed in China, due to the apparent decades-absence of circulating HAdV-7, which presumably resulted in a corresponding lower level of herd immunity in today's population. Given the higher severity of diseases and fatality rates caused by HAdV-7, especially HAdV-7d, extensive surveillance and corresponding molecular investigation, including genotyping, genome typing, and genome sequencing, should be carried out when confronting outbreaks of HAdV pathogens in the high-density populations of China to protect the public and the global community.

Specimen collection and handling

From February 27 to March 6, 2011, twenty-three primary school children under the age of 12 (Dongguan; Guangdong Province) presented with flu-like symptoms, including fever, pharyngalgia, and coughing as well as other indications of ARD. Two were hospitalized with severe symptoms. Eleven throat swab specimens were collected into 2-ml viral transport media; transported at 2°C–8°C; and preserved at −80°C for virus isolation and nucleic acids extraction. This study protocol was approved by the institutional ethics committee of the Center for Disease Control and Prevention of Guangdong Province (Guangdong CDC) and was carried out in accordance with the approved guidelines. The guardians of all under-aged participants gave signed informed consent for participation in the study. Data records of the samples and sample collection are de-identified and completely anonymous.

Detection of respiratory pathogens

Total nucleic acids were extracted from the specimens using the QIAamp minElute virus spin kit (Qiagen; Hombrechtikon, Germany). Human adenovirus, respiratory syncytial virus, influenza virus A and B, parainfluenza virus types 1–3, human rhinovirus, human metapneumovirus, and human coronavirus OC43 and 229E were detected by real-time PCR as described earlier63. For HAdV identification, type-specific primers were used to characterize the type by PCR, as described in an earlier report60.

Adenovirus isolation and genomic DNA extraction

Adenovirus-positive throat swab specimens, identified by PCR analysis, were inoculated into A549 cell cultures and grown in Dulbecco's minimum essential medium supplemented with 100 IU penicillin ml−1, 100 mg streptomycin ml−1, and 2% (v/v) fetal calf serum, in an atmosphere of 5% (v/v) carbon dioxide. Cytopathic effect (CPE) was monitored for at least ten days. Viral genomic DNA was extracted from infected cells for genomic analysis, as described by Le, et al.64.

Genome sequencing and annotation

The genome of HAdV strain DG01_2011 was sequenced using a Sanger chemistry-based, primer-walking method by PCR-amplification, with overlapping regions sequenced3965. Both 5′- and 3′-ends (including both inverted terminal repeats) were sequenced directly by primers Ad7-LTRS1A (5′-GCCTCTTGACGGAACTCG-3′) and Ad14-LTRS2 (5′-GGTCCCTCTAAATACACATACA-3′), respectively, using genomic DNA as template; this ensured the accurate determination of the end sequences3965. The sequence data, collected with an ABI 3730 Genetic Analyzer, provided an average genome coverage of 3- to 5-fold, with both strands represented. Gaps and ambiguous sequences were PCR-amplified using different primers and resequenced. These sequencing ladders were assembled with the SeqMan Pro software 7.0.1 (DNASTAR, Inc.; Madison, WI, USA). Nucleotide and amino acid sequences were aligned with CLUSTAL and BLAST software. The genome sequence was annotated based on the previous annotation of the HAdV-7 prototype strain (Gomen)55 and deposited into GenBank with the accession number KC440171.

In silico restriction endonuclease analysis (REA)

The specific adenovirus genome type was determined using in silico REA of the whole-genome sequences in accordance with the in vitro protocol described by Li, et al.20. This was performed using the software Vector NTI Advance 11.5 (Invitrogen Corp.; San Diego, CA, USA). Twelve restriction enzymes were used for this analysis, as performed by Li, et al.20: BamHI, BclI, BglI, BglII, BstEII, HindIII, HpaI, SmaI, EcoRI, SalI, XbaI, and XhoI.
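The principle of in silico REA can be illustrated with a short script: each enzyme's recognition site is located in the genome sequence, the genome is cut at those positions, and the resulting fragment lengths stand in for the band pattern. The sketch below is an illustration only and is not the Vector NTI implementation used in this study. The function names `digest` and `same_pattern` are hypothetical, the recognition sites and cut offsets are the standard definitions for these enzymes, only the ten enzymes with unambiguous palindromic sites are included (BglI and BstEII, which contain N positions, are omitted), and the genome is treated as linear.

```python
import re

# Recognition site and top-strand cut offset for each enzyme.
# All sites listed are palindromic, so scanning one strand is sufficient.
ENZYMES = {
    "BamHI": ("GGATCC", 1),
    "BclI": ("TGATCA", 1),
    "BglII": ("AGATCT", 1),
    "EcoRI": ("GAATTC", 1),
    "HindIII": ("AAGCTT", 1),
    "HpaI": ("GTTAAC", 3),
    "SalI": ("GTCGAC", 1),
    "SmaI": ("CCCGGG", 3),
    "XbaI": ("TCTAGA", 1),
    "XhoI": ("CTCGAG", 1),
}


def digest(genome: str, enzyme: str) -> list:
    """Return fragment lengths (largest first) for a linear genome."""
    site, offset = ENZYMES[enzyme]
    seq = genome.upper()
    # A lookahead pattern also reports overlapping occurrences of the site.
    cuts = [m.start() + offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    fragments = [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]
    return sorted(fragments, reverse=True)


def same_pattern(genome_a: str, genome_b: str, enzyme: str) -> bool:
    """True if both genomes give identical fragment lengths for the enzyme."""
    return digest(genome_a, enzyme) == digest(genome_b, enzyme)


toy = "AAAAGGATCCTTTTTTGGATCCAA"  # toy sequence with two BamHI sites
print(digest(toy, "BamHI"))  # [12, 7, 5]
print(digest(toy, "EcoRI"))  # [24]
```

Comparing exact fragment lengths is stricter than comparing bands on a gel, where fragments of similar size are not resolved; a tolerance could be added when matching against published in vitro patterns.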

Phylogenetic analyses of HAdV-7 hexon genes and the whole genome sequences

The Molecular Evolutionary Genetics Analysis (MEGA) version 5.1.0 software was used for phylogenetic analyses of the HAdV-7 hexon genes and the whole genomes, with additional sequences retrieved from GenBank database, as described previously6667. Neighbor-joining phylogenetic trees with 1,000 bootstrap replicates were constructed using a maximum-composite-likelihood method with default parameters. Bootstrap numbers shown at the nodes indicate the percentages of 1,000 replications producing the clade, with a value of 80 noted as robust and significant.
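The bootstrap number at a node is the percentage of replicate trees in which the same clade is recovered. A minimal sketch of that bookkeeping follows, assuming each replicate tree has already been reduced to the set of clades it contains; the function names, the data representation, and the toy replicates are hypothetical, and tree building itself (done with MEGA in this study) is not reproduced here.

```python
def bootstrap_support(clade, replicate_trees) -> float:
    """Percentage of replicate trees that contain the given clade.

    Each replicate tree is represented as a set of frozensets of taxon names.
    """
    if not replicate_trees:
        raise ValueError("at least one replicate tree is required")
    target = frozenset(clade)
    hits = sum(1 for clades in replicate_trees if target in clades)
    return 100.0 * hits / len(replicate_trees)


def is_robust(support: float, threshold: float = 80.0) -> bool:
    """Apply the robustness cut-off of 80 stated in the text."""
    return support >= threshold


# Toy example with four replicates instead of 1,000.
replicates = [
    {frozenset({"A", "B"})},
    {frozenset({"A", "B"})},
    {frozenset({"A", "C"})},
    {frozenset({"A", "B"})},
]
support = bootstrap_support({"A", "B"}, replicates)
print(support, is_robust(support))  # 75.0 False
```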

Archived HAdV-7 genome sequences from GenBank were used for phylogenetic analysis. These are as follows (for reference, the names include the corresponding GenBank accession number, country of isolation, strain name, year of isolation (if available), and genome type (if available)): AY594255_Gomen_1952_7p, JX625134_CHN_CQ1198_2010_7d, JF800905_CHN_0901HZ/ShX_2009_7d, JX423388_USA_ak40_1997_7b, JX423386_USA_ARG/ak38_2003_7h, JX423387_USA_ak39_1997_7d2, JX423383_USA_ak35_2006_7d2, JN860677_USA_FS2154_2009_7d2, JN860679_JPN_Takeuchi_3+7_1958, JN860676_AR_87-922_1987_7h, GQ478341_CHN_GZ08_2008, HQ659699_CHN_GZ07_2007, AY594256_USA_vaccine_1962, AY495969_CHN_vaccine, AY601634_USA_NHRC_1315_1997, and KC440171_CHN_DG01_2011_7d.

The HAdV-7 hexon complete sequences used for these analyses are as follows: AB330088_Gomen_1952_7p, JN860679_JPN_Takeuchi_3+7_1958, AF065067_USA_55142_vaccine_1962_7a, AY594256_USA_vaccine_1962, AF515814_CHN_Beijing, AY495969_CHN_vaccine, JN860676_AR_87-922_1987_7h, AF053086_JPN_383_1992_7d, AF053087_JPN_bal_1995_7d2, AY769945_KR_95-81_1995_7d, JX423387_USA_ak39_1997_7d2, JX423388_USA_ak40_1997_7b, AY601634_USA_NHRC_1315_1997, AF053085_JPN_S-1058_1998_7a, AY769946_KR_99-95_1999_7l, AB243009_JPN_2003_7dx, AB243118_JPN_Osaka_2003_7dx, JX423386_USA_ARG/ak38_2003_7h, JX423383_USA_ak35_2006_7d2, HQ659699_CHN_gz07_2007, GQ478341_CHN_GZ08_2008, GU230898_CHN_0901HZ/ShX_2009_7d, JN860677_USA_FS2154_2009_7d2, JX625134_CHN_CQ1198_2010_7d, JQ360620_CHN_Hebei_1101/SJZ_2011, JQ360621_CHN_Hebei_1104/SJZ_2011, JQ360622_CHN_Hebei_1106/SJZ_2011, JX174426_TW_TW1494_2011, JX174427_TW_TW018_2011, JX174428_TW_TW019_2011, JX174429_TW_TW025_2011, JX174430_TW_TW237_2011, and KC440171_CHN_DG01_2011_7d.

Genome recombination analysis

The genomes of HAdV-B7d strains DG01_2011 (Guangdong Province, China; 2011), CQ1198_2010 (Southwestern China; 2010, unpublished), and 0901HZ/ShX/2009 (Northwestern China; 2009)24, along with the prototype Gomen genome, were analyzed for sequence recombination events using the software tool Simplot (http://sray.med.som.jhmi.edu/SCRoftware/simplot/)29. For the recombination analysis, MAFFT software was used first to align the HAdV-B species sequences using default parameters (http://mafft.cbrc.jp/alignment/server/). Default parameter settings for the Simplot software were used for analyzing the whole genomes, along with the following input: window size (2000 nucleotides [nt]), step size (200 nt), replicates used (n = 100), gap stripping (on), distance model (Kimura), and tree model (neighbor-joining). The following genomic sequences of HAdV-B members were used: HAdV-B7p (AY594255), HAdV-B3 (AY599834), HAdV-B16 (AY601636), HAdV-B21 (AY601633), HAdV-B50 (AY737798), HAdV-B11 (AY163756), HAdV-B34 (AY737797), HAdV-B35 (AY271307), HAdV-B14 (AY803294), and HAdV-B55 (FJ643676).
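The sliding-window idea behind this type of analysis can be sketched in a few lines: a window is moved along the alignment in fixed steps and the similarity between the query and a reference is recorded for each window, so that a region where another reference becomes the closest match suggests recombination. The sketch below is a simplification and not the Simplot algorithm itself: it uses plain percent identity rather than the Kimura distance model, performs no bootstrapping, and the function name `window_similarity` is hypothetical. The defaults mirror the window and step sizes stated above.

```python
def window_similarity(query: str, reference: str, window: int = 2000, step: int = 200):
    """Return (window centre, percent identity) pairs along a pairwise alignment.

    Columns with a gap in either sequence are skipped (gap stripping).
    """
    if len(query) != len(reference):
        raise ValueError("aligned sequences must have equal length")
    q_seq, r_seq = query.upper(), reference.upper()
    points = []
    for start in range(0, len(q_seq) - window + 1, step):
        compared = same = 0
        for q, r in zip(q_seq[start:start + window], r_seq[start:start + window]):
            if q == "-" or r == "-":
                continue
            compared += 1
            if q == r:
                same += 1
        if compared:
            points.append((start + window // 2, 100.0 * same / compared))
    return points


# Toy alignment with a small window so the drop in similarity is visible.
print(window_similarity("AAAAAAAAAA", "AAAAATTTTT", window=4, step=2))
# [(2, 100.0), (4, 75.0), (6, 25.0), (8, 0.0)]
```

Running the function once per reference genome and plotting the curves together gives a similarity plot comparable in shape to the one described.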

Substitution rate analysis of the hexon, penton base and fiber genes in HAdV-7

The numbers of non-synonymous (Ka) and synonymous (Ks) substitutions per site between sequences were noted and the Ka/Ks ratios were calculated. This HAdV-7 analysis was conducted using the Nei-Gojobori model68, and included nucleotide sequences from 33 hexon genes, 57 fiber genes, and 19 penton base genes available from GenBank. All positions containing gaps and missing data were eliminated automatically. Evolutionary analyses were performed with MEGA 5.1.066.
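For readers unfamiliar with the calculation, the following is a minimal sketch of a Nei-Gojobori style estimate for one pair of aligned, in-frame coding sequences. It is not the MEGA implementation: the function names are hypothetical, the standard genetic code is assumed, single-base changes to stop codons are counted as non-synonymous when tallying sites, mutational pathways through stop codons are excluded, and codons containing gaps or ambiguous bases are skipped.

```python
from itertools import permutations, product
from math import log

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def syn_sites(codon: str) -> float:
    """Number of synonymous sites in one codon (between 0 and 3)."""
    s = 0.0
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                    s += 1 / 3
    return s


def codon_differences(c1: str, c2: str):
    """Synonymous and non-synonymous differences, averaged over pathways."""
    diff = [i for i in range(3) if c1[i] != c2[i]]
    sd = nd = 0.0
    paths = 0
    for order in permutations(diff):
        cur, s, n, valid = c1, 0, 0, True
        for pos in order:
            nxt = cur[:pos] + c2[pos] + cur[pos + 1:]
            if CODON_TABLE[nxt] == "*":
                valid = False  # pathway passes through a stop codon
                break
            if CODON_TABLE[nxt] == CODON_TABLE[cur]:
                s += 1
            else:
                n += 1
            cur = nxt
        if valid:
            sd, nd, paths = sd + s, nd + n, paths + 1
    return (sd / paths, nd / paths) if paths else (0.0, 0.0)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor correction of a proportion of differences."""
    if p >= 0.75:
        raise ValueError("sequences too divergent for the correction")
    return -0.75 * log(1 - 4 * p / 3)


def ka_ks(seq1: str, seq2: str):
    """Return (Ka, Ks) for two aligned coding sequences of equal length."""
    if len(seq1) != len(seq2) or len(seq1) % 3:
        raise ValueError("sequences must be aligned and in frame")
    S = Sd = Nd = 0.0
    codons = 0
    for i in range(0, len(seq1), 3):
        c1, c2 = seq1[i:i + 3].upper(), seq2[i:i + 3].upper()
        if CODON_TABLE.get(c1, "*") == "*" or CODON_TABLE.get(c2, "*") == "*":
            continue  # skip gaps, ambiguous bases and stop codons
        codons += 1
        S += (syn_sites(c1) + syn_sites(c2)) / 2
        s, n = codon_differences(c1, c2)
        Sd, Nd = Sd + s, Nd + n
    N = 3 * codons - S
    if S == 0 or N == 0:
        raise ValueError("no usable synonymous or non-synonymous sites")
    return jukes_cantor(Nd / N), jukes_cantor(Sd / S)
```

The ratio Ka/Ks is then formed by the caller, taking care that it is undefined when Ks is zero; for a gene-level value such as those in Table 2, the pairwise estimates would be averaged over all sequence pairs.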

The HAdV-7 complete hexon, penton base and fiber gene sequences available in GenBank were retrieved for analysis. The HAdV-7 complete hexon gene sequences used for this analysis are the same as those used in the phylogenetic analysis. The following HAdV-7 complete fiber gene sequences were used: AY495969, AY594255, AY594256, AY601634, GQ478341, HQ659699, JF800905, JN860677, JX625134, GQ265864, GQ265865, GQ265866, GQ265867, GQ265868, GQ265869, GQ265871, GQ265872, GQ265873, HM057190, JQ410438, JQ410439, JQ410440, JQ410441, JQ410442, JQ410443, JX174431, JX174432, JX174433, JX174434, JX174435, KC456126, KC456127, KC456128, KC456129, KC456130, KC456132, KC456133, KC456134, KC456135, KC456136, KC456137, KC456138, KC456139, KC456140, KC456141, KC456142, AC_000018, KJ195467, KF268117, KF268125, KF268134, KF268135, JX423383, JX423386, JX423387, JX423388, and JX625134. The following HAdV-7 complete penton base gene sequences were used: AY495969, AY594255, AY594256, AY601634, GQ478341, HQ659699, JN860677, JX625134, AC_000018, AD001675, KF268117, KF268125, KF268134, KF268135, JX423383, JX423386, JX423387, JX423388, and JX625134.

Author Contributions

Q.Z. and D.S. conceived and designed experiments. S.Z., C.W., C.K., J.S., S.D., L.Zou, J.Z., Z.C., S.J., Z.Z., J.Z., X.Wan, X.Wu, W.Z., L.Zhu, D.S. and Q.Z. performed the experiments and analyzed the data. S.Z., C.W., D.S. and Q.Z. wrote the manuscript. All authors reviewed the manuscript.