
Essential Saccharomyces cerevisiae genome instability suppressing genes identify potential human tumor suppressors


Anjana Srivatsan^a, Binzhong Li^a, Dafne N. Sanchez^a, Steven B. Somach^a, Vandeclecio L. da Silva^b,c, Sandro J. de Souza^b,c, Christopher D. Putnam^a,d, and Richard D. Kolodner^a,e,f,g,1

^a Ludwig Institute for Cancer Research, University of California San Diego School of Medicine, La Jolla, CA 92093-0669; ^b Bioinformatics Multidisciplinary Environment, Instituto Metrópole Digital–Universidade Federal do Rio Grande do Norte, Natal, Brazil 59082-180; ^c Instituto do Cérebro–Universidade Federal do Rio Grande do Norte, Natal, Brazil 59082-180; ^d Department of Medicine, University of California San Diego School of Medicine, La Jolla, CA 92093-0669; ^e Department of Cellular and Molecular Medicine, University of California San Diego School of Medicine, La Jolla, CA 92093-0669; ^f Moores Cancer Center, University of California San Diego School of Medicine, La Jolla, CA 92093-0669; and ^g Institute of Genomic Medicine, University of California San Diego School of Medicine, La Jolla, CA 92093-0669

Contributed by Richard D. Kolodner, July 12, 2019 (sent for review April 23, 2019; reviewed by Marco Foiani and Wolf-Dietrich Heyer)

Gross Chromosomal Rearrangements (GCRs) play an important role in human diseases, including cancer. Although most of the nonessential Genome Instability Suppressing (GIS) genes in Saccharomyces cerevisiae are known, the essential genes in which mutations can cause increased GCR rates are not well understood. Here, 2 S. cerevisiae GCR assays were used to screen a targeted collection of temperature-sensitive mutants to identify mutations that caused increased GCR rates. This identified 94 essential GIS (eGIS) genes in which mutations cause increased GCR rates and 38 candidate eGIS genes that encode eGIS1 protein-interacting or family member proteins. Analysis of TCGA data using the human genes predicted to encode the proteins and protein complexes implicated by the S. cerevisiae eGIS genes revealed a significant enrichment of mutations affecting predicted human eGIS genes in 10 of the 16 cancers analyzed.

genome instability | chromosome dynamics and replication | cancer

The genetic instability that occurs in many cancers is thought to play a critical role in the development and progression of tumors and falls into 3 general categories (1, 2): accumulation of mutations resulting from environmental mutagens, defects in DNA mismatch repair genes, defects that reduce the fidelity of DNA polymerases, and increased levels of cytosine deaminases (3–6); accumulation of genome rearrangements such as translocations and copy number changes (2, 7); and accumulation of changes in chromosome number (8). Our understanding of the genes that suppress genome rearrangements in cancer comes from the study of inherited defects causing cancer susceptibility syndromes such as Fanconi anemia and the BRCA1- and BRCA2-defective breast and ovarian cancer syndromes (9, 10). In addition, cancer genome sequencing projects have identified mutations in candidate Genome Instability Suppressing (GIS) genes, most of which were identified in studies of model organisms (11). However, our understanding of the causes of genome rearrangements in mammalian cells is incomplete in part because it is difficult to perform genetic screens to identify and study GIS genes in mammalian cells.

Genetic studies in Saccharomyces cerevisiae have provided considerable insight into mechanisms that promote and prevent spontaneous genome rearrangements (12). Such studies were made possible by the development of quantitative genetic assays that allow measurement of the rate of accumulation of Gross Chromosomal Rearrangements (GCRs) (13–18) and allow detection of a diversity of types of GCRs (13, 14, 19–24). Overall, the types of genome rearrangements selected in GCR assays resemble those seen in human diseases, including cancer (discussed in ref. 12). In addition, GCR assays have been used to identify genes that prevent GCRs and that alter the types of GCRs formed (13–17, 22, 25–34). These studies have shown that a combination of oxidative defense, DNA replication machinery, DNA repair, cell cycle checkpoint, telomere maintenance, RNA processing, and chromatin modification/remodeling and assembly function in concert to prevent GCRs (12).

Most genes and pathways that prevent and form GCRs in model organisms have been identified through analysis of nonessential genes (12–15, 17, 22, 25, 26, 35, 36). These studies have identified 182 genes that suppress increased GCR rates and 438 cooperating GIS genes, in which mutations do not cause increased GCR rates but only cause increased GCR rates when combined with mutations in other genes. In contrast, studies of essential genes have thus far identified only 29 essential genes in which defects cause increased GCR rates (13, 14, 17, 25, 26, 29, 35, 37–45). Here, we used 2 different GCR assays to screen a collection of temperature-sensitive (ts) mutants for mutations that cause increased GCR rates and identified 94 essential S. cerevisiae GIS (called eGIS) genes, of which 71 were not previously reported, as well as an additional 38 candidate eGIS genes, and analysis of The Cancer Genome Atlas (TCGA) data (46) demonstrated a significant enrichment of mutations affecting 1 or more predicted human eGIS genes in 10 of 16 cancers analyzed.

Results

A Genetic Screen to Identify Essential GIS Genes. To identify essential genes that suppress the formation of GCRs, we crossed query strains containing either the duplication-mediated GCR (dGCR) assay or the short repeat-sequence-mediated (sGCR) assay (SI Appendix, Fig. S1A) with 412 ts mutants [tsV6 (47); provided by Charlie Boone] and a leu2Δ::kanMX4 control strain (11). The 412 ts mutations analyzed affected 248 genes involved in DNA replication, DNA damage response and repair, telomere maintenance, chromatin modification and remodeling, and chromosome cohesion, condensation, and segregation, as well as related pathways implicated in maintaining genome stability, including sumoylation, cell cycle, mitosis and cytokinesis, transcription, and nuclear envelope and nucleo-cytoplasmic transport (Dataset S1). We recovered GCR-assay containing progeny for 399 of the 412 ts mutant alleles (243 of the 248 genes) in crosses with at least 1 and usually both query strains.

Significance

By performing a targeted genetic screen of temperature-sensitive mutations, this study identified 94 essential Saccharomyces cerevisiae genome instability suppressing (eGIS) genes and 38 candidate eGIS genes. Analysis of The Cancer Genome Atlas data demonstrated that mutations in the human homologues of the S. cerevisiae eGIS genes were significantly enriched in 10 different human cancers. These results provide insights into the origin of genome instability in human cancers and provide tools for identifying and evaluating mutations that contribute to the development of cancer.

Author contributions: A.S., B.L., S.J.d.S., C.D.P., and R.D.K. designed research; A.S., B.L., D.N.S., S.B.S., V.L.d.S., S.J.d.S., C.D.P., and R.D.K. performed research; A.S., B.L., V.L.d.S., S.J.d.S., C.D.P., and R.D.K. analyzed data; A.S., S.J.d.S., C.D.P., and R.D.K. wrote the paper; and R.D.K. supervised the entire project.

Reviewers: M.F., Italian Foundation for Cancer Research and University of Milan; and W.-D.H., University of California, Davis.

The authors declare no conflict of interest. Published under the PNAS license.

1To whom correspondence may be addressed. Email: rkolodner@ucsd.edu.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1906921116/-/DCSupplemental.

Published online August 13, 2019.

The progeny were scored using a papillation assay (11) at 30 °C and 25 °C (Fig. 1A). The number of papillae growing on medium that selected for GCRs was converted to a patch score ranging from 0 to 5 (11). At 25 °C and 30 °C, the leu2Δ control strain had average patch scores of 1.69 and 1.19, respectively, in the dGCR assay, and 0.38 and 0.30, respectively, in the sGCR assay (Fig. 1 B and C and SI Appendix, Fig. S1 B and C). To minimize false-positive and false-negative identification of GIS genes (11), we used a cutoff score difference of 0.4 above the leu2Δ control score at each temperature and identified 134 alleles of 103 genes with increased patch scores in at least 1 assay at least at 1 temperature. Some mutations appeared to cause assay-specific increased GCR patch scores; however, analysis that is beyond the scope of the current study will be required to verify this. The ts alleles affecting genes in the categories of DNA replication, chromosome cohesion, condensation, segregation, and other had the largest effect on patch scores (Fig. 1D and Dataset S1).
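The cutoff rule described above can be illustrated with a minimal sketch. The control scores and the 0.4 cutoff are taken from the text; the function name, the inclusive comparison, and the example allele scores are assumptions made for illustration and are not part of the study.

```python
# Control (leu2Δ) average patch scores reported in the text,
# keyed by (assay, temperature in °C).
CONTROL_SCORES = {
    ("dGCR", 25): 1.69,
    ("dGCR", 30): 1.19,
    ("sGCR", 25): 0.38,
    ("sGCR", 30): 0.30,
}
CUTOFF = 0.4  # score difference above the control, from the text


def elevated_conditions(mutant_scores, cutoff=CUTOFF):
    """Return the (assay, temperature) conditions in which a mutant's average
    patch score exceeds the control score by at least `cutoff`.

    Assumption: the comparison is inclusive; the text does not say.
    Conditions with no score for the mutant are skipped.
    """
    hits = []
    for condition, control in CONTROL_SCORES.items():
        score = mutant_scores.get(condition)
        # Small tolerance so a difference of exactly `cutoff` is not lost
        # to floating-point rounding.
        if score is not None and score - control >= cutoff - 1e-9:
            hits.append(condition)
    return hits


# Hypothetical patch scores for one ts allele (not data from the study).
example = {("dGCR", 25): 1.8, ("dGCR", 30): 2.5, ("sGCR", 30): 0.5}
print(elevated_conditions(example))
```

Under the rule in the text, an allele counts as a candidate when this list is non-empty, that is, when the score is elevated in at least 1 assay at least at 1 temperature.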

Growth defects can potentially result in decreased GCR scores in papillation assays. We therefore validated mutations affecting each pathway implicated by the papillation assays by measuring quantitative GCR rates (Dataset S1). This excluded 18 alleles that caused elevated patch scores but not increased GCR rates. This also identified 12 alleles that caused increased GCR rates but not elevated patch scores. Finally, we sequenced all alleles of interest and eliminated 10 strains lacking the expected mutations (SI Appendix, Table S1). In total, we identified 121 ts alleles representing 94 eGIS genes (Dataset S1).

Definition of the eGIS1 and eGIS2 Gene Lists. The 94 essential genes identified defined the first version of the eGIS gene list (eGIS1; Dataset S1), which included genes related to DNA replication and the sister chromatid cohesion, chromosome condensation, and chromosome segregation pathways. Seventy-one of the eGIS1 genes had no previously known role in suppressing GCRs. Examples include IPI1, which has a role in ribosome biogenesis and is potentially a component of the prereplication complex (48, 49), and SLD3, which is required for the activation of the MCM replicative DNA helicase (50). Other newly identified eGIS1 genes encoded cell cycle-related proteins, such as Apc4 (component of the anaphase-promoting complex), Cdc15 (required for mitotic exit), Cdc4 (F-box protein required for cell cycle transitions), Cdc34 (ubiquitin-conjugating enzyme), and Cdc37 (chaperone needed for passage through START during the cell cycle); the 20S and 26S proteasome-related proteins Pre2, Pre6, Pre10, Rpn5, and Rpt6; and the kinetochore and spindle-related proteins Mif2, Nnf1, Nuf2, and Spc42.

Seven previously implicated genes (DNA2, POL30, SPN1, SUA7, TAF4, TOA1, and TFG1) were not present in the eGIS1 list (26, 39, 41, 45): 5 genes (POL30, SUA7, TAF4, TOA1, and TFG1) were not represented in the mutation collection screened here, and 2 genes (DNA2, SPN1) were identified as GIS genes using alleles that were not tested here (dna2-2, TET-SPN1). Because of the lack of alleles in our mutation collection affecting all subunits or pathway components implicated by the eGIS1 genes and the potential for allele-specific effects on GCR rates, we created the expanded eGIS2 list (Dataset S1), which additionally included 38 genes encoding other components of the complexes and pathways defined by the eGIS1 genes. To be stringent, we only added genes encoding components of complexes in which a high proportion of the genes encoding the complex were identified as eGIS1 genes; for example, we identified 4 MCM genes as eGIS1 genes and added the 2 remaining MCM genes.

Identification of Mutations That Activate the DNA Damage Response. The formation of GCRs involves aberrant processing of damaged chromosomes (12). To distinguish eGIS gene mutations that directly or indirectly disrupt normal DNA damage processing from those that increase the levels of damaged chromosomes, we introduced the HUG1-GFP reporter, whose expression is induced by activation of the DNA damage and replication checkpoints (51), into the starting 413 mutant strains. We measured the fold-increase in Hug1-GFP levels at 25 °C and 30 °C, using FACS (Fig. 2A and ref. 52). Multiple mutations caused dramatically increased Hug1-GFP levels (Fig. 2B), including those affecting DNA replication complexes (DNA polymerase alpha/primase, origin recognition, replication factor A, the GINS complex, and Okazaki fragment maturation), cell-cycle complexes (the RENT complex, the anaphase promoting complex/cyclosome, and cyclin-dependent protein kinase), and chromosome packaging complexes (cohesin, the cohesin loader, and the Smc5-Smc6 complex). In addition, defects in sumoylation (“other” category) were also identified.

[Fig. 1 panels: (A) example patches at 25 °C and 30 °C for the control, orc2-4, orc4-ph, pol1-1, psf2-ph, and sld3-ph strains; (B and C) number of mutations versus strain patch score (0 to 5) for the dGCR assay at 25 °C and 30 °C; (D) patch score difference (all assays; axis from −1 to 3) by category: DNA replication; DNA damage response; chromosome packaging, segregation; cell cycle; nucleo-cytoplasmic transport; chromatin; other.]

Fig. 1. Identification of essential genome instability suppressing (eGIS) genes in S. cerevisiae. (A) Example patches of haploid strains containing the dGCR assay at permissive (25 °C) and semipermissive (30 °C) temperatures after replica plating onto GCR-selecting media. (B and C) Histograms of average strain patch scores for the dGCR assay at 25 °C (B) and 30 °C (C). The triangle indicates the position of the average patch score of the control (leu2Δ) strain. (D) Beehive plot of the difference in average patch score for each mutant strain relative to the control strain for the dGCR and sGCR assays measured at 25 °C and 30 °C (Dataset S1); an increase of 0.4 (horizontal line) was previously established as the cutoff for significance (11). Mutant strains were classified by the function of the affected gene.

We divided the mutant strains into 5 categories (Fig. 2C and Dataset S1): those that did not cause increased Hug1-GFP levels (class A, 192 mutations), those that caused increased Hug1-GFP levels only at 25 °C (class B, 28 mutations), at both 25 °C and 30 °C (class C, 76 mutations), only at 30 °C (class D, 80 mutations), or those for which data were only available at 1 temperature due to growth defects (class E, 10 mutations). Class C mutations included those whose Hug1-GFP levels were relatively temperature-insensitive (class C1, 25 mutations) and those whose Hug1-GFP levels were higher at 30 °C relative to 25 °C (class C2, 51 mutations). Class E mutations included those with data available only at 25 °C (class E1, 7 mutations) or at 30 °C (class E2, 3 mutations). Different ts mutations affecting the same protein or complex belong to different classes; for example, the 6 POL1 alleles belonged to classes A, C1, C2, and D. Increased Hug1-GFP levels were observed in at least 1 temperature for 50.3% of the alleles (affecting 142 of 244 genes), and 33.9% of the alleles (affecting 106 of 244 genes) had increased Hug1-GFP levels at 30 °C compared with 25 °C. More than 50% of the tested mutations affecting DNA replication, chromosome packaging and segregation, cell cycle, and nucleo-cytoplasmic transport categories caused increased Hug1-GFP levels (Fig. 2C); the relatively low proportion of ts mutations affecting the DNA damage category that caused increased Hug1-GFP levels is likely because many of the affected genes play modest roles in promoting DNA repair.

We tested the effect of temperature on the accumulation of GCRs for a subset of the 27 C2 and D mutations with at least a 3-fold increase in Hug1-GFP levels at 30 °C compared with that at 25 °C (Dataset S1 and SI Appendix, Table S2). The pol1-1 and cdc9-1 mutants had a dramatic increase in GCR rate when shifted from 25 °C to 30 °C. In contrast, the mcm10-1 GCR rate was not temperature dependent, and smc1-259 and nuf2-61 mutants had reduced GCR rates at the higher temperature. Thus, temperature-dependent changes in the DNA damage response as measured by Hug1-GFP levels did not always correlate with temperature-dependent increases in GCR rates. This may be because the formation of GCRs requires both the generation and misrepair of DNA damage, and in some mutants the increased DNA damage may not be repairable, resulting in no change or even reduced changes in GCR rates or cell death. Regardless, there was a strong correlation in general between defects causing increased levels of Hug1-GFP and those causing increased accumulation of GCRs (Fig. 3).
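The Hug1-GFP classes A through E defined in this section can be sketched as a small decision function. The fold change is computed as the mutant mean GFP-Area signal divided by the control mean, as stated in the Fig. 2 legend. The two thresholds (`increase` and `temp_ratio`) are hypothetical placeholders, because the text shown here does not give the values used to call a level "increased" or to separate class C1 from class C2; the function names are likewise illustrative.

```python
from statistics import mean


def fold_change(mutant_gfp, control_gfp):
    """Mutant mean GFP-Area signal divided by control mean GFP-Area signal."""
    return mean(mutant_gfp) / mean(control_gfp)


def hug1_class(fold_25, fold_30, increase=2.0, temp_ratio=1.5):
    """Assign a Hug1-GFP class from fold changes at 25 °C and 30 °C.

    `None` marks a temperature with no data. `increase` and `temp_ratio`
    are hypothetical thresholds; the text does not state the values used.
    """
    if fold_25 is None and fold_30 is None:
        raise ValueError("no data at either temperature")
    if fold_30 is None:
        return "E1"  # data available only at 25 °C
    if fold_25 is None:
        return "E2"  # data available only at 30 °C
    up_25 = fold_25 >= increase
    up_30 = fold_30 >= increase
    if up_25 and up_30:
        # C1: relatively temperature-insensitive; C2: higher at 30 °C
        return "C2" if fold_30 / fold_25 >= temp_ratio else "C1"
    if up_25:
        return "B"  # increased only at 25 °C
    if up_30:
        return "D"  # increased only at 30 °C
    return "A"      # no increase at either temperature


# Hypothetical fold changes (not data from the study).
print(hug1_class(1.0, 0.7))
print(hug1_class(1.2, 36.0))
print(hug1_class(4.0, None))
```

With real data, the thresholds would need to be set from the control distribution or taken from the study's supporting information rather than from the defaults used here.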

Mutation of eGIS Genes in Human Cancers. To examine whether eGIS genes were inactivated in human cancers, we first generated the human eGIS1 (heGIS1) gene list (115 genes; Datasets S1 and S2), which corresponded to human homologs of the S. cerevisiae eGIS1 genes, and the heGIS2 gene list (162 genes; Datasets S1 and S3), which contained the human homologs of the S. cerevisiae eGIS2 genes and 2 additional human genes (CDCA5 and POT1) that lacked S. cerevisiae homologs but encoded proteins that function in the pathways identified by the S. cerevisiae analysis. We then identified the mutations in these genes in the TCGA data for 16 different cancers (Datasets S2–S5) and computationally analyzed the heGIS1 and heGIS2 gene lists for significant enrichment of mutations (Fig. 4A; see Methods). The mutations included in the analysis were loss-of-function (LOF) mutations (nonsense mutations, frameshift insertions/deletions, and splice-site mutations) or LOF plus missense mutations (Datasets S2–S5 and SI Appendix, Table S3); data from analysis of specific classes of mutations are present in Datasets S2 and S3 and SI Appendix, Tables S4–S12. We also calculated S-scores for each heGIS1 and heGIS2 gene and performed enrichment analysis (11).
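As a rough illustration of a simulation-based enrichment test of this kind, the sketch below compares the observed number of mutations in a gene list with the counts obtained for randomly drawn gene lists of the same size, and then applies the Benjamini-Hochberg procedure named in the Fig. 4 legend. The null model (uniform sampling of genes without replacement), the function names, and the toy data are assumptions; the simulation actually used is described in the study's Methods and may differ, for example by accounting for gene length.

```python
import random


def enrichment_p_value(mutation_counts, gene_set, n_sim=10_000, seed=0):
    """Observed mutation count and empirical one-sided P value for `gene_set`.

    mutation_counts: dict mapping every background gene to its mutation count.
    Null model (an assumption): gene sets of the same size drawn uniformly
    at random, without replacement, from the background genes.
    """
    rng = random.Random(seed)
    genes = sorted(mutation_counts)
    observed = sum(mutation_counts[g] for g in gene_set)
    at_least = 0
    for _ in range(n_sim):
        sample = rng.sample(genes, len(gene_set))
        if sum(mutation_counts[g] for g in sample) >= observed:
            at_least += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return observed, (at_least + 1) / (n_sim + 1)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy background of 100 hypothetical genes: 5 heavily mutated, 95 not.
counts = {f"g{i}": (50 if i < 5 else 1) for i in range(100)}
hot_set = [f"g{i}" for i in range(5)]
observed, p = enrichment_p_value(counts, hot_set, n_sim=2000)
print(observed, p < 0.05)
```

One P value would be computed per cancer type and gene list, and the resulting vector passed to `benjamini_hochberg` to control the false-discovery rate across tests.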

Significant enrichment of mutations in a broad array of human eGIS genes was observed in 9 of the 16 TCGA cancers analyzed: bladder urothelial carcinoma, colorectal adenocarcinoma, glioblastoma multiforme (GBM), kidney renal clear cell carcinoma, acute myeloid leukemia (LAML), low-grade glioma, sarcoma, stomach adenocarcinoma, and uterine corpus endometrial carcinoma (UCEC; Fig. 4 and SI Appendix, Tables S3–S12). Enrichment in GBM was specific to heGIS1, and enrichment in kidney renal clear cell carcinoma and sarcoma was specific to heGIS2. Among these 9 cancer types, bladder urothelial carcinoma samples had the greatest incidence of mutations in heGIS genes (SI Appendix, Fig. S2), and LAML, a cancer that has limited genome instability, had the lowest incidence of mutations in the expanded heGIS2 gene set and among the lowest incidence in the heGIS1 gene set. This is consistent with our previous study, in which LAML was not significantly enriched in mutations in nonessential GIS genes (11). Breast invasive carcinoma (BRCA) only showed significant enrichment for mutations in cohesion genes. In contrast to these 10 cancers, no enrichment for mutations was observed with either gene list in head-neck squamous cell carcinoma, lung adenocarcinoma, lung squamous cell carcinoma, ovarian serous cystadenocarcinoma, prostate adenocarcinoma, and skin cutaneous melanoma. We also found that there was significant enrichment for genes with S-score >2 and >3 among the heGIS1 genes in 11 cancers and among heGIS2 genes in 12 cancers (SI Appendix, Table S13); as expression changes are an important component in the S-score calculation (11), this suggests there may be overexpression of individual GIS genes in different cancers, although we did not analyze this.

[Fig. 2 panels: (A) FACS histograms (count versus GFP-A) for the control (fold = 1), pol1-12 (fold = 0.7), and pri1-m4 (fold = 36) strains; (B) relative Hug1-GFP level at 25 °C and 30 °C, fold change at 30 °C relative to 25 °C, and position of alleles by category along the mutation index; (C) percent of mutations in each mutation class (A, B, C1, C2, D, E1, E2) by category and for all mutations. Categories: cell cycle; chromatin; DNA damage response; other; DNA replication; chromosome packaging; nucleo-cytoplasmic transport.]

Fig. 2. Monitoring induction of the DNA damage response by FACS using the Hug1-GFP reporter. (A) Example histograms of Hug1-GFP levels (GFP-Area signals) as a function of the number of FACS events for the control (leu2Δ), pol1-12, and pri1-m4 strains at 30 °C. Fold changes are the mutant mean GFP-Area signal divided by the control mean GFP-Area signal. (B, Top) Summary of the changes of Hug1-GFP levels for the mutant strains at 25 °C (red) and 30 °C (black) rank ordered by the change at 30 °C. (B, Middle) Ratio of the fold changes at 30 °C relative to 25 °C shows that ts allele-containing strains with the highest Hug1-GFP levels at 30 °C tend to be induced by increased temperature. (B, Bottom) Position of ts alleles affecting different processes in the rank ordered list indicated by vertical lines; DNA replication, chromosome packaging, and segregation, cell cycle, and other categories dominate the alleles with the highest Hug1-GFP levels at 30 °C. (C) Distribution of the ts alleles for different processes based on the Hug1-GFP levels (class A = no increase, class B = increase at 25 °C, class C1 and C2 = increase at 25 °C and 30 °C, class D = increase at 30 °C, class E = missing data; see Identification of Mutations That Activate the DNA Damage Response).

Overall, genes encoding replication-related complexes, the cohesin and condensin complexes, and chromatin- and transcription-related complexes were among the most frequently mutated genes in the 10 cancers that showed enrichment for mutations in the complete heGIS gene lists or a subset of heGIS genes (Fig. 4B). For example, 11.7% of bladder urothelial carcinoma samples contained a mutation or mutations in STAG2, and NIPBL mutation or mutations were observed in 19.8% of UCEC samples, 8.5% of stomach adenocarcinoma samples, 4.5% of low-grade glioma samples, and 1.6% of BRCA samples (Datasets S2–S5). In BRCA, GBM, and LAML, the overall enrichment was entirely attributable to mutations in cohesin genes. Other frequently mutated genes included MTOR (mammalian target of rapamycin), SETX (senataxin), FBXW7 (F-box/WD-repeat containing protein 7), and POLE (catalytic subunit of DNA polymerase epsilon). Both FBXW7 and POLE contain recurrent missense mutations in addition to other loss-of-function mutations; the recurrent POLE mutations cause defects in the proofreading exonuclease activity (53). Given the importance of the SMC5/6 complex in DNA repair, it was surprising that mutations in genes encoding the SMC5/6 complex were only significantly enriched in 1 cancer type (UCEC) (SI Appendix, Table S3).

Discussion

Here, we performed a targeted screen for mutations in essential genes that result in increased GCR rates. We focused on mutations affecting genes involved in DNA and chromosome metabolism, cell cycle regulation, and other processes previously implicated in maintaining genome stability (12). These mutations were screened for those causing increased GCR rates at 2 different temperatures using 2 different assays that detect different but overlapping types of GCRs. Use of the 2 assays and 2 different temperatures increased the number of tests performed, and hence the sensitivity of the screen, and allowed identification of mutations that preferentially affect specific types of GCRs (15). This screen identified 121 mutations in 94 essential GIS genes, 71 of which have not been previously reported. The 94 eGIS genes encode 47 multiprotein complexes (78 genes) and 16 proteins that function as individual proteins (16 genes); this list of genes is referred to as eGIS1. In many cases, 1 or more genes encoding a protein complex were identified as GIS genes, but not all of the genes encoding the complex were identified as GIS genes. The most likely explanation for this is that our mutation collection did not contain a sufficient number of mutations in each gene to allow identification of all possible mutations that could cause a defect resulting in increased GCR rates. We therefore constructed the eGIS2 gene list comprising 132 genes, which also contained other genes encoding protein complexes implicated in suppressing GCRs that were not present in the eGIS1 list (see Definition of the eGIS1 and eGIS2 Gene Lists). In combination with the previously identified nonessential GIS genes (11), the genes in eGIS1 and eGIS2 identify 266 and 304 GIS1 and GIS2 genes, respectively.

[Fig. 3 diagram: complexes and individual proteins grouped under Hug1-GFP and GCR and color-coded by category (DNA replication; DNA damage response; chromosome packaging, segregation; cell cycle; nucleo-cytoplasmic transport; other; chromatin).]

Fig. 3. Comparison of mutations causing increased Hug1-GFP signaling and increased GCR rates. Alleles causing increased Hug1-GFP were mapped to protein complexes or individual proteins on the basis of the affected genes. The resulting 95 complexes/proteins were divided into those only affecting Hug1-GFP signaling, those only affecting GCR rates, or those affecting both. The complexes/proteins were then color-coded by biological process.

[Fig. 4 panels: (A) number of mutations, observed versus simulation average and interquartile range, for heGIS1 (blue) and heGIS2 (red) in BLCA, BRCA, COADREAD, GBM, HNSC, KIRC, LAML, LGG, LUAD, LUSC, OV, PRAD, SARC, SKCM, STAD, and UCEC; (B) number of LOF and missense mutations per gene in BLCA + BRCA + COADREAD + GBM + KIRC + LAML + LGG + SARC + STAD + UCEC (n = 4376; BRCA is enriched for cohesin gene mutations). Genes labeled in B: PBRM1, TRRAP, FBXW7, NIPBL, MTOR, POLE, SETX, SMARCA4, STAG2, SMARCA2, STAG1, NUP98, PDS5B, ESPL1, SMARCC2, NCAPD3, TICRR, PCF11, SMC1A, SMC4, TOPBP1, SMC1B, SMC2, CUL1, RFC1, POLA1, SMC3, ERCC2, PRKCB, POLD1, PDS5A, SMC5.]

Fig. 4. Analysis of TCGA data for mutations in human homologs of S. cerevisiae eGIS genes. (A) Summary of the simulations to determine whether human homologs of the eGIS1 and eGIS2 genes are significantly mutated in cancers sequenced by the TCGA. Solid circles are the observed number of loss-of-function and missense mutations for the heGIS1 and heGIS2 gene lists. The box and whiskers correspond to the average and interquartile range from the in silico simulations. Statistically significant P values are indicated by the number of asterisks (4 = P < 0.0001, 3 = P < 0.001, 2 = P < 0.01, 1 = P < 0.05). All significant P values were above a false-discovery rate of 0.05 as determined by the Benjamini-Hochberg procedure. (B) Count of the number of mutations in the top 50 mutated heGIS2 genes from the 9 cancers with significant levels of mutations and BRCA, which is enriched for cohesin gene mutations.

The eGIS genes identified have implicated a number of metabolic processes in the suppression of GCRs. The most prominent group of genes identified was those encoding DNA replication factors; these included most replication proteins including ORC, the MCM helicase, GINS proteins, all the DNA polymerases, DNA primase, RFA, RFC, and various proteins involved in control of DNA replication. This is consistent with previous studies implicating replication errors in the production of GCRs and studies showing that down-regulation of DNA polymerases can result in increased formation of GCRs (13, 54). A second prominent group of GIS genes were those encoding cohesin, condensin, and the Smc5-Smc6 cohesion complex; the latter has been extensively studied in regard to its role in DNA repair (29) and plays a role in suppressing GCRs (29, 40). Among other functions, cohesin acts during DNA replication to maintain sister chromatid cohesion, which plays a role in sister chromatid recombination (55), a process suggested to suppress GCRs (12, 26). Condensin plays a role in maintaining the 3D structure of chromosomes (55), but how this functions to suppress GCRs is not clear; regardless, condensin defects caused similar increases in GCR rates as those caused by defects in cohesin and the Smc5-Smc6 cohesin complex (this study and refs. 29 and 40). We also found a modest number of mutations affecting the proteasome (56), sumoylation pathways (35, 57), chromatin remodeling complexes (52, 58), and nuclear pore (59), all of which act in DNA repair and the DNA damage response to some extent. Among the more surprising GIS genes identified were those encoding the TORC2 complex that may play a role in DNA damage responses (60); Senataxin, which acts in R-loop regulation (61); and Pkc1, which plays a minor role in DNA damage checkpoint responses (62).

To better classify the eGIS genes, we included a Hug1-GFP downstream transcriptional reporter of the DNA damage re-sponse in our screen (51, 52). This allowed the identification of 3 classes of genes. The first class was those in which defects caused increased Hug1 expression but no increase in GCR rates; defects in these genes are likely to directly affect the transcription of DNA damage response genes and are unlikely to reflect processes that suppress GCRs or possibly cause increased levels of DNA damage that is repaired in ways that do not result in the formation of GCRs. The second, and most prevalent, class was those in which defects caused both increased Hug1 expression and increased GCR rates and encompassed virtually all of the replication and cohesin/condensin genes. It seems likely that defects in replication genes result in increased levels of DNA damage because of rep-lication errors (13, 54), whereas defects in cohesin and possibly condensin genes result in reduced rates of DNA repair (29, 55), both of which might result in higher steady state levels of DNA damage; in both cases, misrepair results in GCRs. The third class was those in which defects cause increased GCR rates but no increase in Hug1 expression. It is likely that defects in these latter

genes either cause defects in the DNA damage response (e.g., PKC1) (62) or defects in DNA repair that result in increased GCRs without increases in the steady state levels of DNA damage. Previous studies have suggested that defects in the human homologs of nonessential GIS genes are prevalent in cancers that have increased genome instability (11, 63). Here, by analyzing TCGA data for 16 cancers, we identified 9 cancers with significant enrichment for LOF mutations and/or LOF plus missense muta-tions in the eGIS genes and 1 cancer (BRCA) in which there was only enrichment for mutations in the cohesin complex-encoding genes. Of the 9 cancers with enrichment for mutations in the eGIS genes, in 2 cases (GBM and LAML), this enrichment was only a result of the inclusion of the cohesion genes, whereas in the other 7 cancers, the enrichment was not attributable to any specific subset of eGIS genes. Furthermore, among the 9 cancers with general enrichment for mutations in the eGIS genes were 4 cancers with enrichment for mutations in chromatin remodeling/modifi-cation genes, 6 cancers with enrichment for mutations in cohesin genes, 2 cancers with enrichment for mutations in condensin genes, 1 cancer with enrichment for mutations in Smc5-Smc6 cohesin genes, and 4 cancers with enrichment for mutations in replication genes. This distribution of classes of mutated genes was also reflected among the top most frequently mutated genes in the 10 cancers with some evidence for enrichment for mutations in dif-ferent eGIS genes (Fig. 4B). Defects in cohesin-encoding genes and condensin-encoding genes in cancer have been reported pre-viously, although our analysis identified mutations in additional cancers, including GBM, low-grade glioma, and UCEC (cohesin) and stomach adenocarcinoma and UCEC (condensin) beyond those in which defects have been previously reported (64). 
We also observed mutations in a broad diversity of genes encoding proteins that function in DNA replication beyond the more generally observed mutator mutations affecting DNA polymerases delta and epsilon, supporting the idea that DNA replication errors could contribute to increased genome instability in cancer (53). Overall, our results support the view that eGIS genes are significantly mutated in a number of cancers. The mutations observed could be reduced-function mutations that in some cases might be dominant, loss-of-function mutations that could cause haploinsufficiency, or gain-of-function dominant mutations. Additional studies will be required to determine the frequency and nature of defects in essential and nonessential GIS genes in different cancers and whether these actually cause genome instability in these cancers.

Methods

S. cerevisiae Strains. The dGCR and sGCR query strains used for systematic mating, RDKY7635 (dGCR; MATα hom3-10 ura3Δ0 leu2Δ0 trp1Δ63 his3Δ200 lyp1::TRP1 cyh2-Q38K iYFR016C::PMFA1-LEU2 can1::PLEU2-NAT yel072w::CAN1/URA3) and RDKY7964 (sGCR; MATα hom3-10 ura3Δ0 leu2Δ0 trp1Δ63 his3Δ200 lyp1::TRP1 cyh2-Q38K iYFR016C::PMFA1-LEU2 can1::PLEU2-NAT yel068c::CAN1/URA3), were described previously (11). The query strain RDKY8174 (MATα hom3-10 ura3Δ0 leu2Δ0 trp1Δ63 his3Δ200 lyp1::TRP1 cyh2-Q38K iYFR016C::PMFA1-LEU2 can1::PLEU2-NAT yel072w::CAN1/URA3 HUG1-EGFP.hphNT1) was used to introduce the HUG1-GFP reporter.

Systematic Strain Construction and Screening. The dGCR and sGCR query strains were crossed to selected strains from the tsv6 temperature-sensitive mutant collection (BY4741 MATa strains; obtained from Charles Boone, Donnelly Centre, University of Toronto, Toronto, ON, Canada), using a RoTor instrument (Singer). Systematic mating, sporulation, haploid selection, GCR patch tests, and GCR rate measurements were performed as described (11), except that the crosses were performed at 25 °C and GCR patch tests and rate measurements were performed at 25 °C and/or 30 °C, as indicated. To verify expected mutations in strains of interest, ∼1-kb overlapping fragments spanning the genes of interest were amplified from 2 independent colonies from the cross progeny and subjected to Sanger sequencing at a commercial facility.

Measurement of Hug1-GFP Levels. Hug1-GFP abundance in log phase cells was measured at 25 °C and 30 °C using FACS, as described (52), using the following modifications: Cells were grown and processed in 96-well plates and analyzed using a BD LSR Fortessa analytical cytometer with an HTS loader. Excitation was at 488 nm, and the fluorescence signal was collected through a 505-nm long-pass filter and a HQ510/20 band-pass filter (Chroma Technology Corp). For each sample, 30,000 events were recorded. The mean value of GFP abundance was calculated using FlowJo software and normalized to the mean GFP value in wild-type cells.
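The normalization step described above amounts to dividing the mean per-event GFP signal of a sample by the mean signal of the wild-type control. The following Python sketch illustrates only that arithmetic; it is not the FlowJo workflow used in the study, the function name `normalized_gfp` is hypothetical, and the fluorescence values are made-up placeholders rather than measured data.

```python
from statistics import mean


def normalized_gfp(sample_events, wild_type_events):
    """Return the mean GFP signal of a sample relative to the wild-type mean.

    Both arguments are sequences of per-event fluorescence values
    (arbitrary units). A result of 1.0 means wild-type-like abundance.
    """
    wt_mean = mean(wild_type_events)
    if wt_mean <= 0:
        raise ValueError("wild-type mean must be positive to normalize")
    return mean(sample_events) / wt_mean


# Hypothetical per-event values, for illustration only (not real data).
wild_type = [100.0, 110.0, 90.0, 100.0]
mutant = [210.0, 190.0, 200.0, 200.0]

ratio = normalized_gfp(mutant, wild_type)
print(f"Normalized Hug1-GFP level: {ratio:.2f}")
```

In practice each sample would contribute the 30,000 recorded events rather than four values, and gating would be applied before averaging.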

Analysis of Cancer Genomics Data. TCGA data were obtained from the cBIO portal (http://www.cbioportal.org). Simulations to determine statistical significance were performed as described (SI Appendix, Methods) (11). The functional impact of missense mutations was predicted as described (SI Appendix, Methods) (11), using 9 prediction tests (Datasets S4–S6); to be predicted deleterious, a missense mutation had to be scored as deleterious in at least 5 tests (Ndamage ≥ 5), as this cutoff captured known recurrent missense mutations in POLE, FBXW7, KAT8, MTOR, and SETX.
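The Ndamage cutoff can be read as a simple vote across the 9 prediction tests: a missense mutation is called deleterious when at least 5 tests score it as deleterious. The sketch below shows that counting rule in Python under stated assumptions. The predictor names `tool_1` to `tool_9` are hypothetical placeholders (the actual tests are listed in the SI Appendix and Datasets S4–S6), and the calls are invented for illustration.

```python
def n_damage(calls):
    """Count how many prediction tests scored the mutation as deleterious.

    `calls` maps a predictor name to True (deleterious) or False.
    """
    return sum(1 for is_deleterious in calls.values() if is_deleterious)


def is_predicted_deleterious(calls, cutoff=5):
    """Apply the Ndamage >= cutoff rule (cutoff of 5 out of 9 tests)."""
    return n_damage(calls) >= cutoff


# Hypothetical predictor names and calls, for illustration only.
calls_a = {f"tool_{i}": i <= 5 for i in range(1, 10)}  # 5 of 9 deleterious
calls_b = {f"tool_{i}": i <= 4 for i in range(1, 10)}  # 4 of 9 deleterious

print(n_damage(calls_a), is_predicted_deleterious(calls_a))
print(n_damage(calls_b), is_predicted_deleterious(calls_b))
```

A majority-style cutoff of this kind trades sensitivity for specificity: raising it would discard more borderline mutations, while lowering it would admit mutations supported by only a minority of tests.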

ACKNOWLEDGMENTS. We thank Dr. Charlie Boone for the ts mutants. This work was supported by NIH grant R01GM26017 (to R.D.K.), Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (Brazil) grant (23038.004629/2014-19 to S.J.d.S.), and the Ludwig Institute (R.D.K., C.D.P., and S.J.d.S.).

1. L. A. Loeb, A mutator phenotype in cancer. Cancer Res. 61, 3230–3239 (2001).
2. B. Vogelstein et al., Cancer genome landscapes. Science 339, 1546–1558 (2013).
3. S. Nik-Zainal et al.; Breast Cancer Working Group of the International Cancer Genome Consortium, Mutational processes molding the genomes of 21 breast cancers. Cell 149, 979–993 (2012).
4. C. Palles et al.; CORGI Consortium; WGS500 Consortium, Germline mutations affecting the proofreading domains of POLE and POLD1 predispose to colorectal adenomas and carcinomas. Nat. Genet. 45, 136–144 (2013).
5. A. de la Chapelle, Genetic predisposition to colorectal cancer. Nat. Rev. Cancer 4, 769–780 (2004).
6. L. A. Loeb, C. C. Harris, Advances in chemical carcinogenesis: A historical review and prospective. Cancer Res. 68, 6863–6872 (2008).
7. K. Inaki, E. T. Liu, Structural mutations in cancer: Mechanistic and functional insights. Trends Genet. 28, 550–559 (2012).
8. L. Sansregret, C. Swanton, The role of aneuploidy in cancer evolution. Cold Spring Harb. Perspect. Med. 7, a028373 (2017).
9. A. D. D’Andrea, Susceptibility pathways in Fanconi’s anemia and breast cancer. N. Engl. J. Med. 362, 1909–1919 (2010).
10. H. Kobayashi, S. Ohno, Y. Sasaki, M. Matsuura, Hereditary breast and ovarian cancer susceptibility genes (review). Oncol. Rep. 30, 1019–1029 (2013).
11. C. D. Putnam et al., A genetic network that suppresses genome rearrangements in Saccharomyces cerevisiae and contains defects in cancers. Nat. Commun. 7, 11256 (2016).
12. C. D. Putnam, R. D. Kolodner, Pathways and mechanisms that prevent genome instability in Saccharomyces cerevisiae. Genetics 206, 1187–1225 (2017).
13. C. Chen, R. D. Kolodner, Gross chromosomal rearrangements in Saccharomyces cerevisiae replication and recombination defective mutants. Nat. Genet. 23, 81–85 (1999).
14. J. E. Chan, R. D. Kolodner, A genetic and structural study of genome rearrangements mediated by high copy repeat Ty1 elements. PLoS Genet. 7, e1002089 (2011).
15. C. D. Putnam, T. K. Hayes, R. D. Kolodner, Specific pathways prevent duplication-mediated genome rearrangements. Nature 460, 984–989 (2009).

16. P. Kanellis et al., A screen for suppressors of gross chromosomal rearrangements identifies a conserved role for PLP in preventing DNA lesions. PLoS Genet. 3, e134 (2007).
17. K. Myung, A. Datta, R. D. Kolodner, Suppression of spontaneous chromosomal rearrangements by S phase checkpoint functions in Saccharomyces cerevisiae. Cell 104, 397–408 (2001).
18. J. A. Hackett, D. M. Feldser, C. W. Greider, Telomere dysfunction increases mutation rate and genomic instability. Cell 106, 275–286 (2001).
19. V. Pennaneach, R. D. Kolodner, Stabilization of dicentric translocations through secondary rearrangements mediated by multiple mechanisms in S. cerevisiae. PLoS One 4, e6389 (2009).
20. C. D. Putnam, V. Pennaneach, R. D. Kolodner, Chromosome healing through terminal deletions generated by de novo telomere additions in Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. U.S.A. 101, 13262–13267 (2004).
21. C. D. Putnam, V. Pennaneach, R. D. Kolodner, Saccharomyces cerevisiae as a model system to define the chromosomal instability phenotype. Mol. Cell. Biol. 25, 7226–7238 (2005).
22. C. D. Putnam, K. Pallis, T. K. Hayes, R. D. Kolodner, DNA repair pathway selection caused by defects in TEL1, SAE2, and de novo telomere addition generates specific chromosomal rearrangement signatures. PLoS Genet. 10, e1004277 (2014).
23. J. E. Chan, R. D. Kolodner, Rapid analysis of Saccharomyces cerevisiae genome rearrangements by multiplex ligation-dependent probe amplification. PLoS Genet. 8, e1002539 (2012).
24. K. H. Schmidt, J. Wu, R. D. Kolodner, Control of translocations between highly diverged genes by Sgs1, the Saccharomyces cerevisiae homolog of the Bloom’s syndrome protein. Mol. Cell. Biol. 26, 5406–5420 (2006).
25. K. Myung, C. Chen, R. D. Kolodner, Multiple pathways cooperate in the suppression of genome instability in Saccharomyces cerevisiae. Nature 411, 1073–1076 (2001).
26. C. D. Putnam, T. K. Hayes, R. D. Kolodner, Post-replication repair suppresses duplication-mediated genome instability. PLoS Genet. 6, e1000933 (2010).
27. S. Smith et al., Mutator genes for suppression of gross chromosomal rearrangements identified by a genome-wide screening in Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. U.S.A. 101, 9039–9044 (2004).
28. P. C. Stirling et al., The complete spectrum of yeast chromosome instability genes identifies candidate CIN cancer genes and functional roles for ASTRA complex components. PLoS Genet. 7, e1002057 (2011).
29. G. De Piccoli et al., Smc5-Smc6 mediate DNA double-strand-break repair by promoting sister-chromatid recombination. Nat. Cell Biol. 8, 1032–1034 (2006).
30. S. Banerjee et al., Mph1p promotes gross chromosomal rearrangement through partial inhibition of homologous recombination. J. Cell Biol. 181, 1083–1093 (2008).
31. A. Motegi, K. Kuntz, A. Majeed, S. Smith, K. Myung, Regulation of gross chromosomal rearrangements by ubiquitin and SUMO ligases in Saccharomyces cerevisiae. Mol. Cell. Biol. 26, 1424–1433 (2006).
32. K. H. Schmidt, R. D. Kolodner, Suppression of spontaneous genome rearrangements in yeast DNA helicase mutants. Proc. Natl. Acad. Sci. U.S.A. 103, 18196–18201 (2006).
33. M. E. Huang, A. G. Rio, A. Nicolas, R. D. Kolodner, A genomewide screen in Saccharomyces cerevisiae for genes that suppress the accumulation of mutations. Proc. Natl. Acad. Sci. U.S.A. 100, 11529–11534 (2003).
34. M. E. Huang, R. D. Kolodner, A biological network in Saccharomyces cerevisiae prevents the deleterious effects of endogenous oxidative DNA damage. Mol. Cell 17, 709–720 (2005).
35. C. P. Albuquerque et al., Distinct SUMO ligases cooperate with Esc2 and Slx5 to suppress duplication-mediated genome rearrangements. PLoS Genet. 9, e1003670 (2013).
36. C. D. Putnam et al., Bioinformatic identification of genes suppressing genome instability. Proc. Natl. Acad. Sci. U.S.A. 109, E3251–E3259 (2012).

37. S. Banerjee, K. Myung, Increased genome instability and telomere length in the elg1-deficient Saccharomyces cerevisiae mutant are regulated by S-phase checkpoints. Eukaryot. Cell 3, 1557–1566 (2004).
38. K. Myung, S. Smith, R. D. Kolodner, Mitotic checkpoint function in the formation of gross chromosomal rearrangements in Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. U.S.A. 101, 15980–15985 (2004).
39. M. E. Budd, C. C. Reis, S. Smith, K. Myung, J. L. Campbell, Evidence suggesting that Pif1 helicase functions in DNA replication with the Dna2 helicase/nuclease and DNA polymerase delta. Mol. Cell. Biol. 26, 2490–2500 (2006).
40. J. Y. Hwang et al., Smc5-Smc6 complex suppresses gross chromosomal rearrangements mediated by break-induced replications. DNA Repair (Amst.) 7, 1426–1436 (2008).
41. S. Allen-Soltero, S. L. Martinez, C. D. Putnam, R. D. Kolodner, A Saccharomyces cerevisiae RNase H2 interaction network functions to suppress genome instability. Mol. Cell. Biol. 34, 1521–1534 (2014).
42. A. Colosio, C. Frattini, G. Pellicanò, S. Villa-Hernández, R. Bermejo, Nucleolytic processing of aberrant replication intermediates by an Exo1-Dna2-Sae2 axis counteracts fork collapse-driven chromosome instability. Nucleic Acids Res. 44, 10676–10690 (2016).
43. S. K. Deng, Y. Yin, T. D. Petes, L. S. Symington, Mre11-Sae2 and RPA collaborate to prevent palindromic gene amplification. Mol. Cell 60, 500–508 (2015).
44. K. A. Shah et al., Role of DNA polymerases in repeat-mediated genome instability. Cell Rep. 2, 1088–1095 (2012).
45. Y. Zhang et al., Genome-wide screen identifies pathways that govern GAA/TTC repeat fragility and expansions in dividing and nondividing yeast cells. Mol. Cell 48, 254–265 (2012).
46. J. N. Weinstein et al.; Cancer Genome Atlas Research Network, The cancer genome Atlas pan-cancer analysis project. Nat. Genet. 45, 1113–1120 (2013).
47. Z. Li et al., Systematic exploration of essential yeast gene function with temperature-sensitive mutants. Nat. Biotechnol. 29, 361–367 (2011).
48. L. Huo et al., The Rix1 (Ipi1p-2p-3p) complex is a critical determinant of DNA replication licensing independent of their roles in ribosome biogenesis. Cell Cycle 11, 1325–1339 (2012).
49. T. A. Nissan et al., A pre-ribosome with a tadpole-like structure functions in ATP-dependent maturation of 60S subunits. Mol. Cell 15, 295–301 (2004).
50. S. Tanaka, H. Araki, Helicase activation and establishment of replication forks at chromosomal origins of replication. Cold Spring Harb. Perspect. Biol. 5, a010371 (2013).
51. M. A. Basrai, V. E. Velculescu, K. W. Kinzler, P. Hieter, NORF5/HUG1 is a component of the MEC1-mediated checkpoint response to DNA damage and replication arrest in Saccharomyces cerevisiae. Mol. Cell. Biol. 19, 7041–7049 (1999).
52. A. Srivatsan et al., The Swr1 chromatin-remodeling complex prevents genome instability induced by replication fork progression defects. Nat. Commun. 9, 3680 (2018).
53. E. Rayner et al., A panoply of errors: Polymerase proofreading domain mutations in cancer. Nat. Rev. Cancer 16, 71–81 (2016).
54. F. J. Lemoine, N. P. Degtyareva, K. Lobachev, T. D. Petes, Chromosomal translocations in yeast induced by low levels of DNA polymerase: A model for chromosome fragile sites. Cell 120, 587–598 (2005).
55. K. Jeppsson, T. Kanno, K. Shirahige, C. Sjögren, The maintenance of chromosome structure: Positioning and functioning of SMC complexes. Nat. Rev. Mol. Cell Biol. 15, 601–614 (2014).
56. S. Ben-Aroya et al., Proteasome nuclear activity affects chromosome stability by controlling the turnover of Mms22, a protein important for DNA repair. PLoS Genet. 6, e1000852 (2010).
57. M. Nie, M. N. Boddy, Cooperativity of the SUMO and ubiquitin pathways in genome stability. Biomolecules 6, 14 (2016).
58. C. B. Gerhold, M. H. Hauer, S. M. Gasser, INO80-C and SWR-C: Guardians of the genome. J. Mol. Biol. 427, 637–651 (2015).
59. S. Loeillet et al., Genetic network interactions among replication, repair and nuclear pore deficiencies in yeast. DNA Repair (Amst.) 4, 459–468 (2005).
60. K. Shimada et al., TORC2 signaling pathway guarantees genome stability in the face of DNA strand breaks. Mol. Cell 51, 829–839 (2013).
61. H. E. Mischo et al., Yeast Sen1 helicase protects the genome from transcription-associated instability. Mol. Cell 41, 21–32 (2011).
62. M. Soriano-Carot, I. Quilis, M. C. Bañó, J. C. Igual, Protein kinase C controls activation of the DNA integrity checkpoint. Nucleic Acids Res. 42, 7084–7095 (2014).
63. T. A. Knijnenburg et al., Genomic and molecular landscape of DNA damage repair deficiency across the cancer genome Atlas. Cell Rep. 23, 239–254.e6 (2018).
64. M. D. Leiserson et al., Pan-cancer network analysis identifies combinations of rare somatic
