• Nenhum resultado encontrado

Differential RNA-seq, Multi-Network Analysis and Metabolic Regulation Analysis of Kluyveromyces marxianus Reveals a Compartmentalised Response to Xylose.

N/A
N/A
Protected

Academic year: 2017

Share "Differential RNA-seq, Multi-Network Analysis and Metabolic Regulation Analysis of Kluyveromyces marxianus Reveals a Compartmentalised Response to Xylose."

Copied!
31
0
0

Texto

(1)

Differential RNA-seq, Multi-Network Analysis

and Metabolic Regulation Analysis of

Kluyveromyces marxianus

Reveals a

Compartmentalised Response to Xylose

Du Toit W. P. Schabort*, Precious K. Letebele, Laurinda Steyn, Stephanus G. Kilian, James C. du Preez

Department of Microbial, Biochemical and Food Biotechnology, University of the Free State, Bloemfontein, South Africa

*SchabortDWP@ufs.ac.za

Abstract

We investigated the transcriptomic response of a new strain of the yeastKluyveromyces marxianus, in glucose and xylose media using RNA-seq. The data were explored in a num-ber of innovative ways using a variety of networks types, pathway maps, enrichment statis-tics, reporter metabolites and a flux simulation model, revealing different aspects of the genome-scale response in an integrative systems biology manner. The importance of the subcellular localisation in the transcriptomic response is emphasised here, revealing new insights. As was previously reported by others using a rich medium, we show that peroxi-somal fatty acid catabolism was dramatically up-regulated in a defined xylose mineral medium without fatty acids, along with mechanisms to activate fatty acids and transfer prod-ucts ofβ-oxidation to the mitochondria. Notably, we observed a strong up-regulation of the 2-methylcitrate pathway, supporting capacity for odd-chain fatty acid catabolism. Next we asked which pathways would respond to the additional requirement for NADPH for xylose utilisation, and rationalised the unexpected results using simulations with Flux Balance Analysis. On a fundamental level, we investigated the contribution of the hierarchical and metabolic regulation levels to the regulation of metabolic fluxes. Metabolic regulation analy-sis suggested that genetic level regulation plays a major role in regulating metabolic fluxes in adaptation to xylose, even for the high capacity reactions, which is unexpected. In addi-tion, isozyme switching may play an important role in re-routing of metabolic fluxes in sub-cellular compartments inK.marxianus.

Introduction

The yeastKluyveromyces marxianusis emerging as a host for metabolic engineering and recombinant protein production, having a number of advantages overSaccharomyces cerevi-siae. These characteristics include thermotolerance and the ability to utilise a wide variety of sugars, including the pentose xylose, that are abundant in lignocellulosic biomass. Moreover, it

a11111

OPEN ACCESS

Citation:Schabort DTWP, Letebele PK, Steyn L, Kilian SG, du Preez JC (2016) Differential RNA-seq, Multi-Network Analysis and Metabolic Regulation Analysis ofKluyveromyces marxianusReveals a Compartmentalised Response to Xylose. PLoS ONE 11(6): e0156242. doi:10.1371/journal.pone.0156242

Editor:Marie-Joelle Virolle, University Paris South, FRANCE

Received:February 5, 2016

Accepted:May 11, 2016

Published:June 17, 2016

Copyright:© 2016 Schabort et al. This is an open access article distributed under the terms of the

Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability Statement:This Whole Genome Shotgun project has been deposited at DDBJ/ENA/ GenBank under the accession LYPD00000000 and can be accessed under the BioProject PRJNA316809 for the strain. The version of the draft genome described in this paper is version LYPD01000000.

(2)

is probably the fastest growing eukaryote on Earth [1]. Previous studies of this yeast highlighted the large physiological variation amongK.marxianusstrains, in terms of their proneness to fermentation and ability to utilize various substrates, suggesting genetic diversity within the species [2]. Our strain UFS-Y2791 produces a significant amount of ethanol even under aerobic conditions, suggesting that this strain is not typically Crabtree negative. We recently assembled a draft genome of strain UFS-Y2791 from our University of the Free State MIRCEN yeast cul-ture collection. The strain was originally isolated from juice prepared from the arid region suc-culentAgave americana. To enable metabolic engineering ofK.marxianusand other yeasts, it is important to understand genetic responses to environmental factors and different substrates. Such an understanding requires both advanced high-throughput omics methods as well as var-ious integrative computational methods. Thus far, only two studies of genome-scale transcript levels inK.marxianushave been published. Work on strain DMKU 3–1042 suggested that the yeast up-regulatedβ-oxidation in the absence of glucose repression in a complex xylose medium due to the use of lipids as additional carbon source in the absence of glucose, implying possible utilization of the small amounts of lipids present in the rich medium [3]. A question that may also arise here is that in case lipids were indeed present in the rich medium, whether the response in theβ-oxidation was due to glucose de-repression or due to stimulation by lip-ids. Another study focussed on the response of strain Y179 to anaerobic versus micro-aerobic conditions, as well as to the concentration of inulin in the medium. This study focussed on highlighting differential expression in various stress-response genes, those involved in autop-hagy, and a number of individual key enzymes. Both of these recent studies used a complex medium containing peptone and yeast extract [4].

Next-generation sequencing (NGS) has become a popular method not only for sequencing genomes but also for various other experiments [5,6]. RNA-seq is a method in which RNA transcripts are reverse-transcribed and quantitatively sequenced [7]. This method can accu-rately quantify differential expression and the improved sensitivity and dynamic range, as well as the ability to elucidate splice variants, makes it superior to microarray technology. The com-bination of genome sequencing and RNA-seq of a new species or strain is a powerful approach to rapidly gain both a blueprint (genome) and a response (transcriptome) to some perturba-tions [3] or when comparing different species [8]. Innovative methods are now needed to effec-tively use both the blueprint and the response in a concise manner that is attractive to

scientists, since a massive amount of data is generated in a single experiment. It is a major chal-lenge to investigate and represent omics results at the genome scale. Gene set enrichment using Gene Ontology (GO) is an established method for microarrays and is now becoming estab-lished for RNA-seq (reviewed in [9]).

Analysis of metabolic pathways could be a sensible additional approach as it gives a sense of directionality to the response, and the scientist often associates a certain endpoint metabolite with a pathway, providing a means to simplify the understanding of the dataset. Painted ren-derings could be made of individual pathways, but often the number of pathways excludes a concise representation, and the integrated nature of metabolism is lost. Further, in metabolic pathways the interactions (reactions) consist of hypergraphs containing more than one sub-strate or product, complicating the rendering. Cellular overviews, as is available with software such as Pathway Tools, summarises metabolism into a single image [10]. However, the cellular overview method usually assumes a single master framework and inevitably ignores selected highly connected nodes. Cellular compartmentalisation is also often neglected in omics analy-sis due to the added complexity.

An approach that is complementary to the pathway-based understanding of metabolism is that of the study of metabolite levels. Metabolomics, which entails the characterisation and quantification of small compounds in the cell, is still technologically demanding and labour

role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

(3)

intensive, however (reviewed in [11,12]). A new approach in systems biology is to derive com-pounds that are likely differentially expressed or extensively homeostatically regulated, from a differential transcriptomics dataset. These are the reporter metabolites [13] and the method could be generalised and applied in many scenarios such as described elsewhere [14].

Models of biochemical pathways have much potential to reveal new insights that are not obvious from the exploration of datasets. The understanding of complex biochemical datasets using computational models is called systems biology. Some flux modelling approaches such as Flux Balance Analysis (FBA) simulations could be done even at the genome scale [15,16], as was recently reported forK.lactis[17]. A long-standing fundamental question is how the flux through a metabolic pathway is regulated. Is the change in a given flux achieved by changes in the concentrations of metabolites that affect that enzyme (metabolic level), or through changes in gene expression or post-translational modifications (hierarchical level)? By combining flux models with measured metabolite exchange fluxes, or through more complex13C-Metabolic Flux Analysis, estimation of fluxes at different physiological states can be very informative. Metabolic Regulation Analysis (MRA) combines differential expression levels with differential flux estimates to reveal for each flux the contribution of the metabolic and hierarchical levels of regulation [18].

Here we report on a detailed RNA-seq transcriptomics study to explore the response ofK.

marxianusUFS-2791 to glucose and xylose in a chemically defined medium under aerobic con-ditions, including a number of different systemic analyses. Although a major potential future application ofK.marxianusmay be ethanol production from lignocellulosic biomass, which is an anaerobic or oxygen limited process where both glucose and xylose may be present, we chose to perform aerobic cultivations with glucose or xylose separately to remove any con-founding effects. The strain also utilizes these sugars sequentially. The high cultivation temper-ature of 35°C was also chosen as this strain was determined to have a growth optimum close to 35°C. Our samples were also taken during mid-exponential phase to eliminate the effect of any possible ethanol stress that may occur later during the fermentation. To address the need for effective integrative exploration of omics data, including RNA-seq, and to combine these data with modelling and simulation, we developed the Reactomica software. We combined gene set enrichment of Gene Ontology (GO), reporter metabolites, metabolic pathway maps with a strong focus on subcellular compartmentalisation, and two new approaches of representation, namely pathway-to-pathway networks and reporter metabolite-enzyme networks.

The aims were to first effectively explore the key features of the differential response to xylose in a defined medium under aerobic conditions and determine whether the peroxisomal lipid catabolic response previously observed [3] was limited to a rich medium. Subsequently, the central carbon metabolism was investigated in detail, separating the response into subcellu-lar compartments. It is known that in most yeasts that are able to utilize xylose, the two-step conversion via NADPH dependent xylose reductase and the NAD+dependent xylitol dehydro-genase is present and that the co-factor independent isomerase reaction is absent, as inK.

(4)

Materials and Methods

Genome sequencing and annotation

The genome ofK.marxianusUFS-Y-2791 was sequenced on the Illumina HiSeq platform to 100-fold coverage at the Onderstepoort Biotechnology Platform, Pretoria, South Africa. Assembly was performedde novousing Abyss. Open reading frames were found by Augustus. Putative protein sequences were annotated using one of two methods. For annotation of enzymes (778 genes), creation of a pathway genome database (PGDB) and all subsequent anal-ysis involving metabolic pathways, protein sequences were annotated against the Kegg-Kaas server with default settings on the server, resulting in a list of EC number annotations. Output was subsequently parsed and converted to input for the PathoLogic algorithm of Pathway Tools. For gene set enrichment using Gene Ontology (GO), sequences were additionally anno-tated against the UniProtKB database on a high-performance computing cluster, using BLASTP. An E-value cut-off of 1E-10 was used for gene set enrichment as was done by others [3]. An additional 73 genes with E-values between 1E-5 and 1E-10 were included in the list of annotated genes and flagged for further annotation. We preferred the rich manually curated SwissProt annotations over automated Trembl annotations, resulting in 68.3% of annotations fromS.cerevisiae, 17.4% fromK.lactis, 3.2% and 2.8% from two strains ofK.marxianus, and the rest from other species. The draft genome, predicted open reading frames, predicted pro-tein sequences and functional annotations are made available in the Supplementary materials (S1 Draft Genome,S1 ORF,S1 Proteins,S1 Table).

Strains and cultivation

All chemicals and fermentation media used in cultivations were obtained from Sigma Aldrich, Seelze, Germany.K.marxianusUFS-Y2791 from our University of the Free State MIRCEN yeast culture collection was maintained on YPD agar slants at 4°C. Cultivation was carried out under aerobic conditions in 500 ml shake flasks with 30 ml working volume at 180 rpm. We chose the shake flask format to allow expensive13C-isotopic tracer studies which would be cost prohibitive in bioreactors. All cultivations were carried out at 35°C. The pre-inoculum was incubated in YPD medium for 8 h. The inoculum was prepared by dilution of the pre-inocu-lum to an OD690of 0.05 in a chemically defined medium and grown at 35°C for 16 h. The

defined medium contained (g l-1): glucose or xylose, 5; (NH4)2SO4, 2.5; MgSO47H2O, 0.5;

CaCl2H2O, 0.03; NaCl, 0.1; citric acid, 0.25 and KH2PO4, 10. Filter-sterilised vitamins were

added to the autoclaved medium at the following concentrations (mg l-1): biotin, 0.025; calcium pantothenate, 0.5; nicotinic acid, 0.5;p-aminobenzoic acid, 0.1; pyridoxine HCl, 0.5; thiamine HCl, 0.5 andmyo-inositol, 12.5. Trace elements were added according to du Preez and van der Walt [19]. The pH of the medium was adjusted to 5.5. Glucose and xylose were quantified using HPLC. Acetate, ethanol and glycerol were the only fermentation metabolites secreted in significant amounts and were quantified using HPLC. All samples were taken before an OD600

of 0.8 was reached.

RNA-seq data generation

Cells from duplicate cultivations were harvested in mid-exponential phase at an OD690of 0.8

(5)

3.75 million trimmed paired-end reads were mapped to the UFS-Y2791 genome. Processing of sequencing data was carried out in Galaxy. Reads were trimmed using Trimmomatic [20] and mapped to the genome using TopHat, while CuffDiff [20] was used to test for differential expression. CuffDiff reports p-values as the statistical significance and well as the q-value, which is the p-value after accounting for multiple comparisons [21]. Genes were only consid-ered to be significantly differentially expressed when q-values were below 0.05.

Gene set enrichment and reporter metabolites

For gene set enrichment and all other network constructions and renderings, programs were developed as part of a new software suite for integrative systems biology that we call Reacto-mica, implemented in the Wolfram Mathematica language. Gene set enrichment scores for GO terms and pathways were calculated similar to that described by Ideker [22] as follows: GO ontologies goslim_yeast.obo and go-basic.obo.txt.m were obtained from the Gene Ontology database [23]. The ontologies were converted to graphs using the‘is_a’child-parent mappings. For obtaining the gene set, a depth-first scan was performed from each GO term and all addi-tional GO terms found from the node of interest were collected to ensure that highly specialised terms would find utility in GO_slim. Mappings from the annotated genes to the GO-terms in the GO‘biological_process’,‘molecular_function’and‘cellular_component’attributes of a UniProt entry were used to map from GO terms to genes. Significance q-values could be con-verted to Z-scores as the negative of the inverse cumulative distribution function and then summed over all genes in the gene set to give the representative statistic for a group of genes as the total Z-score [22]. Random gene sets were generated with bootstrapping (1000 iterations) and total Z-scores calculated. The mean and standard deviation at a variety of gene set sizes were calculated and used to calculate the enrichment score S:

S¼Zðtotal;TestÞ MeanðZ;BackgroundÞ

Standard deviationðZ;BackgroundÞ ð1Þ

For pathway gene set analysis, MetaCyc pathways were used from the BioCyc pathway genome database constructed for this strain using Pathway Tools, which is based on the Meta-Cyc database. Reporter metabolite enrichment scores were calculated in the same manner as described above and according to Patil and Nielsen [13]. For reporter metabolites, the back-ground mean and standard deviation of random genes sets were rather generated by sampling enzyme-encoding genes only, as opposed to the complete gene set, since a large fraction of the differentially expressed genes were metabolic, generating a higher random background enrichment.

Pathway maps

To explore the metabolic response, metabolic pathway maps were created from MetaCyc path-ways and RNA-seq data were mapped using various colouring schemes with Reactomica, har-nessing automated hypergraph map layout and manual override. The initial linkage between genes and reactions were made by the Pathway Tools algorithm PathoLogic, with genomic annotations from the Kegg-Kaas annotation server. For the Log2(fold change) colouring

(6)

Pathway-to-pathway networks

Pathways were clustered by the number of metabolites in common to generate a scoring matrix. The number of metabolites in common was normalised by the smaller of the two metabolite sets of a pathway and a threshold was selected for including a mapping that resulted in optimal rendering. Orphaned pathways were clustered together.

Molecular networks

Molecules were clustered by the simple similarity criterion of string matching of SMILES strings of each compound in the PGDB using the edit distance. Only the closest match was included as a mapping. The method is similar to that done by Barupal et al. [24].

Reporter metabolite-enzyme networks

The metabolic network was converted from a hypergraph into a graph in which only the inter-actions between differentially expressed enzyme genes and enriched reported metabolites were retained.

Differential Flux Analysis

The aim was to approximate the fluxes on glucose and xylose, sufficient for approximation of MRA values. FBA was used which included optimisation of biomass formation, and measured specific sugar consumption rate and specific ethanol, acetate and were included to constrain the flux solution. The model was constructed in Reactomica using reactions from MetaCyc for which representative genes were found in the genome annotation ofK.marxianusUFS-Y2791. Some additional reactions were defined such as the combined electron transport and oxidative phosphorylation reaction. The biomass reaction was adapted from Fischer et al. [25]. The flux model and parameters is provided inS5 Table. FBA was implemented in Reactomica and the method of FBA was reviewed elsewhere [15]. FBA uses linear optimisation, maximising the biomass formation reaction, constrained by the reaction stoichiometry, reversibility con-straints, uptake rate of nutrients (glucose or xylose) and production rates (ethanol and acetate).

Metabolic Regulation Analysis

The metabolic regulation of a flux can be separated into a metabolic componentρm and a

hier-archical componentρh. The two levels of regulation are combined in the relation below [18].

1¼rhþrm ð2Þ

The metabolic componentρm models the contribution to regulation that changes in

metab-olite concentrations make and is described by

rm¼

X

X

d lnðvÞ

d lnðXÞ:

d lnðXÞ

d lnðJÞ ð3Þ

whereJis theflux though that enzyme, modelled by the rate equationv, and affected by changes in the concentration of metaboliteX. The hierarchical componentρh models the effect

of changes in maximal activity of the enzyme, which is usually linearly dependent on the enzyme concentrationeand described by:

rh¼d lnðeÞ

(7)

The hierarchical component could thus be experimentally determined by measuring a dif-ference in maximal enzyme activity. By usingEq 2and substituting with the experimentally determinedρh,ρmcould also be calculated. Though maximal activity is affected not only by

the protein concentration and post-translational modifications, it is reasonable and practical to use changes in transcript levels obtained from RNA-seq instead of maximal activities or protein levels for an estimated MRA. For MRA, gene expression changes were considered significant if at least one gene (among paralogs) was considered significant from q-values. Transcript fold changes were calculated from total transcript abundances for all genes mapping to a reaction, in case of isozymes resulting from paralogs or multi-functional proteins. In this manner, the method is robust to potential errors in annotation of paralogs since central metabolic genes are known to have on average higher transcript abundance in comparison with the vast majority of other genes. The fold changes of individual minor isozymes with very low expression levels also cannot dominate the analysis.

Results

In glucose medium, respiro-fermentative growth was observed, even though fully aerobic con-ditions were ensured by the small working volume in a large flask, vigorous shaking, and sam-pling at low OD600values, with ethanol, glycerol and acetate as fermentation products. In

xylose medium, fermentation products were absent, with an apparently purely respiratory metabolism. The maximum specific growth rate in glucose medium was approximately 0.8 h-1, while the maximum specific growth rate in xylose medium was approximately 0.35 h-1. All graphs and other renderings were generated using a new software suite for bioinformatics and integrated systems biology, termed Reactomica, which was developed in-house. Interactive datasets are provided in Computable Document Format (.CDF) as supplementary materials and are viewable using the free Wolfram CDF player, which can be downloaded from the Wol-fram website [26], or by using Mathematica. High quality differential RNA-seq datasets were generated on the Illumina HiSeq platform to a high read depth. Figs1and2show the distribu-tion of the data. Throughout,“up-regulated”refers to genes statistically up-regulated in a xylose medium compared to the condition with glucose as the carbon source, with a q-value below 0.05, as reported by CuffDiff (see‘Materials and methods‘). Out of the total of 4 953 putative genes analysed, 329 were up-regulated and 251 down-regulated. Supplementary fileS1 Tableprovides all expression values, differential expression statistics and UniProt annotations of all genes.

Fig 1. Volcano plot of RNA-seq data.FC: fold change.–q: the corrected p- values after taking multiple comparisons into account, as performed by CuffDiff. Red: up-regulated on xylose. Blue: down-regulated on xylose. Black: constitutively expressed. Statistical procedures were performed in CuffDiff.

(8)

The role of rich medium versus defined medium in the xylose response

We compared results from reference [3], which reported 36 genes in strain DMKU 3–1042 that were up-regulated only in the aerobic xylose-containing rich medium and not in any of the other three conditions tested (seeS1 Table) (we refer to this gene set as the xylose-present/ glucose absent unique gene set). Differential expression of 13 genes were correlated with the reference data (up-regulated in UFS-2791) and should be specific to the xylose response (or the absence of glucose), independent from the background medium, and present across strains. Notably, POT1, POX1 and FOX2 of peroxisomalβ-oxidation were dramatically up-regulated in both datasets. Carnitine O-acetyltransferase (CAT2), associated with inter-compartmental transport of fatty acids, was also strongly up-regulated in both sets, supporting increased capac-ity for lipid catabolism (see a detaileddiscussionlater). ICL2 and CIT3 of the 2-methylcitrate cycle was also strongly up-regulated in both experiments (see a detaileddiscussionlater). Alde-hyde dehydrogenase (ALD4), glucokinase 1 (GLK1) and dicarboxylic amino acid permease (DIP5) are the other enzymes functioning in central metabolism. Two regulatory proteins Ty transcription activator (TEC1) and the G2/mitotic-specific cyclin-4 (CLB4) were also among these. The rest of the genes in this list are not well-characterised. These include stationary phase protein 4 (SPG4), putative metabolite transport protein ywtG and uncharacterized mem-brane protein YMR155W.

Thirteen genes were uncorrelated with the reference data (constitutive in UFS-2791) (YPR011C, HSP12, YHL008C, POP6, NCE103, YLL032C, NCS2, GAS1, TOS1, PDR5, MYO1, EIS1, YDR134C) and are thus specific to either the strain or the rich medium. Most of these are uncharacterised proteins.

Finally, two genes were anti-correlated with the reference data (down-regulated in UFS-2791). These are the ammonia transport outward protein 3 (ATO3) and dihydroxyacetone kinase 1 (DAK1). The exact functions of ATO1, ATO2 and ATO3 are not currently known, apart from their possible involvement in mitochondrial retrograde signalling and ammonia production during starvation. It seems that ATO3 requires rich medium containing amino acids to be up-regulated, as was also observed inS.cerevisiae[27,28]. In summary, several genes were highlighted as up-regulated both in our data and the xylose-present/glucose absent unique gene set from reference [3], in particular, the peroxisomal beta-oxidation and the sup-porting fatty acyl transporters. The roles of a number of other genes are unclear from this pre-liminary analysis, however.

(9)

Gene Ontology

A total of 1 611 GO terms were included in the UniProt annotations of all proteins in the genome (seeMaterials and methods). To obtain an initial overview, the concise‘GO_slim’

yeast from the Gene Ontology Consortium was used. GO term enrichment was performed sep-arately for‘cellular_component’,‘molecular_function’and‘biological_process’components of

‘GO_slim’and the enrichment scores were used to render maps (see‘Materials and methods‘). There is currently no generally accepted cutoff for interpreting significance of an enrichment value [14]. For the‘cellular_component’ontology, assuming a score cut-off at 1.64 (p = 0.05), five terms were considered to be significantly enriched (Fig 3andS2 Table). These are‘ extracel-lular region’,‘peroxisome’,‘plasma membrane’,‘membrane’,‘mitochondrion’and‘cell wall’.

‘Plasma membrane’is a subset of‘membrane’in the GO-slim ontology. It is striking that the peroxisomal genes were up-regulated (14 out of 32 genes) and only one down-regulated. The

‘extracellular region’exhibited mostly up-regulation (22 out of 80 genes) with seven ulated genes. Of the 269 plasma membrane genes, 31 were up-regulated and 22 were down-reg-ulated. In the mitochondrion the number of up and down-regulated genes were approximately equal (29 and 23, respectively out of 308 genes). The cell wall genes showed also mostly up-reg-ulation, with 17 out of 80 genes up-regulated and four down-regulated.

In the‘molecular_function’ontology, only three terms are regarded as significant.‘ Oxidore-ductase activity’,‘kinase activity’and‘peptidase activity’(Fig 4andS3 Table).‘Oxidoreductase activity’is the most highly enriched term in the‘GO_slim’gene sets (enrichment score = 4.49) and also in the complete GO enrichment (score = 13.2). It is notable that the redox balance, which is regulated by oxidoreductases, is one of the key considerations in the ability to utilize pentoses by yeasts, as it usually involves a requirement for NADPH and the additional genera-tion of NADH, as should also be the case withK.marxianussince it does not possess a xylose isomerase.

The‘biological_process’ontology is rich in providing a framework for exploring the regula-tion of cellular processes in response to different carbon sources. The‘carbohydrate transport’

gene set was the most significantly altered, with 21 out of 65 transporter genes up-regulated and 7 down-regulated (Fig 5andS4 Table). The‘transmembrane transport’gene set displayed a more equal up vs down-regulation, whereas‘peroxisomal organization’genes had exclusively up-regulated genes. The latter feature is shared with the‘peroxisome’gene set in the‘ cellular_-component’ontology, indicative of a general up-regulation of peroxisomal components and

Fig 3. Gene set enrichment map of RNA-seq data using the‘GO_slim’yeast‘cellular_component’

ontology.Size indicates the enrichment score of a gene set. Colour indicates the up/down direction of regulation: Red, up; blue, down. Brightness rises with the fraction of genes that are regulated in the dominant direction (up or down). Only nodes larger or equal than‘cell wall’are significant.

(10)

activities. The‘amino acid transport’gene set also indicated differential expression. The gene set of‘ion transport’was also enriched, but is closely related to‘amino acid transport’. Also sig-nificant is‘carbohydrate metabolic process’, which is better interpreted in terms of metabolic pathways. Although‘histone modification’and‘proteolysis involved in cellular protein cata-bolic process’both have significant scores, they only have one and four genes, respectively, in the gene sets, and they were therefore not interpreted for further investigation.

In summary, peroxisomal organisation and metabolism were clearly and strongly up-regu-lated on xylose, consistent with a previous observation [3] that at least fatty acidβ-oxidation was up-regulated in a complex medium that likely contained small amounts of lipids, which were absent in the present study where a chemically defined medium was used. Secondly, a strong effect was seen on carbohydrate transporters, which was consistent with the experimen-tal setting. Some of these differentially expressed putative sugar transporter genes, such as the variousHGT1homologs ofK.lactis, were up-regulated as high as 1242-fold (seeS4 Table). Some of these may encode for xylose transporters. As sugar transporters are highly similar, additional annotation of this group is required before more conclusions regarding sugar trans-port may be drawn. Thirdly, the extracellular region was affected. These are proteins that are secreted into the medium, such as lytic enzymes or receptors that sense a new environment, as well as the mating genes. Some of these gene sets are explored in more detail in the supplemen-tary textS1 Text.

It is interesting to note that although many genes in the mitochondrion were differentially regulated, these genes are not the main enzymes of the TCA cycle, electron transport or oxida-tive phosphorylation. Thus, there is no response suggesting adaptation to a change in the inter-nal energy charge. As the maximum specific growth rate in xylose medium was less than half

Fig 4. Gene set enrichment map of RNA-seq data using the‘GO_slim’yeast‘molecular_function’

ontology.Size indicates the enrichment score of a gene set. Colour indicates the up/down direction of regulation: Red, up; blue, down. Brightness rises with the fraction of genes that are regulated in the dominant direction (up or down). Only nodes larger or equal than‘peptidase activity’are significant.

(11)

of that in glucose medium, whereas the cell sizes were comparable, one might expect differen-tial regulation of cell cycle progression genes, or even biomass formation-specific genes like those encoding for ribosomal proteins. There was a notable absence of such gene sets in the enrichment analysis. This is also reasonable since cell cycle control signalling proteins are mostly kinases, which are controlled by post-translational modifications. Most genetic

responses observed in the data thus deal with utilisation of alternative substrates. Further, there was only a weak response in the oxidative stress genes, as was expected from aerobic cultiva-tion. From the genes annotated with‘response to oxidative stress’(PRDX5, CTT1, CCP1, YFH1, DCS1, HMX1, SVF1, YDR222W), only peroxiredoxin-5 (PRDX5, mitochondrion, cyto-plasm, peroxisome, 3.9-fold up-regulated) and catalase T (CTT1, cytocyto-plasm, 2.4-fold up-regu-lated) were moderately differentially regulated, resulting in an enrichment score of 1.19, which was insignificant against the background enrichment.

The data indicated several interesting aspects regarding sugar utilisation, including up-regu-lation of pathways for utilisation of galactose, xylose and arabinose (see S5). Several enzymes withβ-glucosidase activity were found in the annotation (seeS1 Text), whereas only one enzyme was both extracellular and significantly up-regulated (48.8-fold), namely cellobiase (EC 3.2.1.21). Cellobiase is not a typical cellulase that can depolymerise cellulose, as it functions to hydrolyse the disaccharide cellobiose.K.marxianusUFS-Y2791 also does not possess typical secreted xylanases, proteases, peptidases or lipases, which would allow a microorganism to thrive on plant matter or even attack a live plant. Instead, the strain possesses the inulinase geneINU1which hydrolyses the fructan inulin or the disaccharide sucrose. TheINU1gene was dramatically up-regulated 91-fold and abundantly expressed in xylose medium.

Fig 5. Gene set enrichment map of RNA-seq data using the‘GO_slim’yeast‘biological_process’

ontology.Size indicates the enrichment score of a gene set. Colour indicates the up/down direction of regulation: Red, up; blue, down. Brightness rises with the fraction of genes that are regulated in the dominant direction (up or down). Only nodes larger or equal than‘proteolysis involved in cellular protein catabolic process’are significant.

(12)

The pheromone signalling system involved in sexual reproduction, as well as several other genes involved in the conjugation process as well as in invasive growth, were also up-regulated (see‘extracellular region’inS2 Table). In the presence of xylose (or absence of glucose) there is thus a response to physiologically adapt to an invasive lifestyle and utilise other sources of nutrients (xylose, arabinose, inulin, cellobiose and amino acids), as would be found in the natu-ral plant environment. A long-term survival strategy (sexual reproduction) is also activated in the more nutrient-poor condition.

Most metabolic pathways for amino acid synthesis were down-regulated, as was expected at the lower growth rate;μmaxvalues of 0.8 and 0.35 were recorded on glucose and xylose,

respec-tively. The well-known importers of ammonia were constitutive (MEP3) or down-regulated (MEP2, 2.8-fold).

Global metabolic response elucidated by pathway-to-pathway networks

To capture the global metabolic response to growth on xylose in a single view, an innovative pathway-to-pathway network was constructed by clustering pathways together by their com-mon metabolites (Fig 6,S1 NetworkandS1 Table). The wide down-regulatory profile of amino acid metabolic pathways is visible, where all amino acid biosynthetic pathways were shut down, with only glycine biosynthesis III having some element of up-regulation. Similarly, most amino acid catabolic pathways were shut down with only glutamate, valine, tyrosine, phenylal-anine, tryptophan and methionine degradation III having some up-regulated genes.

Fatty acid oxidation, peroxisomal ethanol degradation and mixed acid fermentation, which cluster together, were mostly up-regulated. Close by in the network is glycolysis, which was down-regulated. A moderately increased capacity in the glycerol-3-phosphate shuttle was evi-dent here, with the mitochondrial glycerol-3-phosphate dehydrogenaseGUT2gene 4.4-fold up-regulated. Note that the up-regulated glycerol-3-phosphate shuttle functions independently from down-regulated glycerol production.

Reporter metabolites

Reporter metabolite enrichment was performed to reveal those metabolites around which sig-nificant differential expression of enzymes took place. These compounds could be interpreted as those from which there was(a)a marked change in capacity for their utilisation or produc-tion between condiproduc-tions,(b)to have required a significant degree of regulation by their neigh-bouring enzymes to establish homeostasis in the different environment, or(c)to have a different concentration predicted between conditions where it may be a candidate as a signal-ling molecule. The top eight reporter metabolites of highest enrichment were 3-phosphoglycer-ate, acetaldehyde, NADH, glyceraldehyde-3-phosph3-phosphoglycer-ate,β-alanine, NAD+, glycerate and ethanol, in this order (S1 Table). NADH, NAD+, acetaldehyde and ethanol are involved in redox metabolism and thus support the high gene set enrichment score of the GO term‘ oxido-reductase activity’(GO:0016491). It was interesting to note that the enrichment score of ATP was among the lowest against the background and considered insignificant. Also, glutathione and its reduced form, which are known to be involved in redox metabolism, were not signifi-cantly enriched, suggesting no severe oxidative stress. Oxidative stress inK.marxianusis seem-ingly more important under oxygen limiting conditions, as imposed by static and high-temperature conditions [3].

Molecular networks

(13)

molecular structure similarity matching protocol. This approach could identify some co-regu-lated groups of structurally reco-regu-lated molecules that were not evident from pathways-based anal-yses.Fig 7shows a number of clusters of enriched reporter metabolites grouped by their molecular structures (see interactive fileS2 Networkfor molecular structures and annotations).

Groupings of CoA-conjugates, long-chain and short-chain fatty acids are representative of increased catabolic activity ofβ-oxidation and activation steps. Up-regulated sugar clusters corresponded to those found in reporter metabolite-enzyme networks, but were represented in

Fig 6. Gene set enrichment map of RNA-seq data using the pathway-to-pathway network.Size indicates the enrichment score of the gene set representing each pathway. Colour indicates the up/down direction of regulation: Red, up; blue, down. Brightness indicates uni-directionality of regulation. The‘Self’cluster on bottom-left includes pathways not mapped to other pathways since their intersections scores were below a threshold.

(14)

a clearer fashion. Noteworthy is that effectors of trehalose containing sugars and sugar lipids were down-regulated. The three-carbon molecules in central carbon metabolism, which are close together in the network including 3-phosphoglycerate, 2-phosphoglycerate, glyceralde-hyde-3-phosphate and glyceraldehyde, were strongly enriched and their effectors down-regu-lated. This mapping is useful as a concise representation of potentially all metabolites in a cell,

Fig 7. A molecular network of all reporter metabolites enriched at or above enrichment score>= 1.64 (p = 0.05).Mapping between compounds was performed with a local string matching procedure based on the SMILES string of each compound. Scores were normalised to the string size for the longest of the two molecules in a pairwise comparison. Edge weights represent normalised similarity scores. The“Self”node maps all compounds with an insufficient normalised similarity score to other compounds (<0.4). Size indicates the enrichment score of a gene set. Colour indicates the up/down direction of regulation: Red, up; blue, down. Brightness indicates uni-directionality of regulation.

(15)

grouped by their structures. The method would especially be useful for visualising metabolo-mics datasets of which the link to reactions and pathways is not yet clear. A better separation of clusters is obtained in mapping larger metabolites as found in secondary metabolism.

Key enzymes that may affect metabolite pools

Combining both the reporter metabolites and the enzymes by which they are regulated into an enzyme-reporter metabolite network, effectively reveals the hotspots of metabolic regulation as well as the key players in regulation. Including all interactions with enriched compounds (enrichment score>1.64, q<0.05) was not feasible for a detailed investigation (1352

interac-tions). Instead, we extracted the interactions containing nodes with enrichment scores above 3.0 (Fig 8, see interactive fileS3 Network).

A major network is evident in which NAD+/NADH, oxygen and the aldehyde dehydroge-nase have a strong involvement. A second subnetwork involves sugar metabolism and glycosyl-ation. A third revolves around one-carbon metabolism. The subnetwork of NAD+(enrichment score = 4.2) was extracted (Fig 9, left, see interactive fileS4 Network), which reveals a number of genes involved with biosynthesis being down-regulated. In addition, a large contribution to the enrichment score is made by the aldehyde dehydrogenasesALD4,ALD5andALD6, by the alcohol dehydrogenases annotated asADH2andald, and by sorbitol dehydrogenaseSOR1.

SinceK.marxianusrelies on an NADPH dependent xylose reductase, it was expected that NADPH would be enriched (enrichment score = 2.6). The enzymes directly affecting NADPH are explored inFig 9(right, see interactive fileS5 Network). We anticipated up-regulation of major NADPH producing enzymes to supply reducing power for xylose utilisation by xylose reductase. Instead, only three enzymes directly involved with NADP(H) were up-regulated.

IDP1(mitochondrial NADP-specific isocitrate dehydrogenase) was only moderately up-regu-lated (2.2-fold). A more significantly up-reguup-regu-lated enzyme, mitochondrial quinone oxidore-ductase (QOR, 9.9-fold up-regulated) reduces 1,4-benzoquinone, as was shown by cloning this gene fromK.marxianusinE.coli[29]. It is not part of the major catabolic processes in central metabolism, however. Another isYPR127W, a putative pyridoxal reductase that functions to degrade vitamin B6 and involved with multidrug resistance and not central metabolism [30]. In the oxidative pentose phosphate pathway, which is assumed to be the main generator of NADPH in most species,GND1(6-phosphogluconate dehydrogenase) andSOL1 (6-phospho-gluconolactonase) were, however, constitutively expressed. Another enzyme,SPS19 (peroxi-somal 2,4-dienoyl-CoA reductase), was strongly up-regulated at 269-fold. It functions to reduce double bonds to facilitateβ-oxidation of unsaturated fatty acids in the peroxisome [31], another indication that peroxisomal metabolism was strongly differentially regulated.

Central carbon metabolism

To explore the metabolic response of the central carbon metabolism, metabolic pathway maps were created from MetaCyc pathways and RNA-seq data were mapped using various colouring schemes. Figs10and11show total transcript abundances mapped to reactions (see alsoS1 Pathway). It is evident that on glucose, the central glycolytic route from glucose to ethanol is highly expressed. In xylose medium, transcript abundance is less pronounced in glycolysis and ethanol production, whereas PPP, the pyruvate dehydrogenase bypass andβ-oxidation enzymes increase in transcript abundance.

An interesting differential expression pattern is evident inFig 12. Consistent with the exper-imental setting, the NADPH-dependent D-xylose reductase geneXYL1was drastically up-reg-ulated on xylose (48.8-fold) with very high gene expression levels on xylose. Xylitol

(16)

sorbitol dehydrogenaseSOR1can act as a xylitol dehydrogenase inS.cerevisiae[32]. Significant up-regulation ofSOR1by 208-fold supports this function. XylulokinaseXKS1was also

11.3-fold up-regulated. TransaldolaseTAL1of the non-oxidative pentose phosphate pathway (PPP) and ribose-5-phosphate isomeraseRKI1were moderately up-regulated (4.8 and

Fig 8. Reporter metabolite-enzyme network, capturing enriched reporter metabolites and the enzymes that affect them.Only reporter

metabolites with enrichment values above 3.0 were included and only the differentially expressed genes from RNA-seq with q-values (corrected p-values) below 0.05. For reporter metabolites (circles), size indicates the enrichment score of a gene set and colour indicates the up/down direction of regulation. Brightness indicates uni-directionality of regulation. For enzymes (stars), colour indicates the up/down direction of regulation based on the log2(fold change) scheme. Reporter metabolite enrichment values are represented inS1 Table. (For full information on gene names, the interactive fileS3 Network

or the corresponding annotations inS1 Table, using the“Gene names (primary)”column, may be consulted.)

(17)

2.6-fold). We did, however, not observe up-regulation of any enzymes in the oxidative branch to support additional NAPDH production for xylose utilisation, or to combat oxidative stress by charging of the glutathione system. We found a remarkably clear down-regulation of glyco-lytic genes on xylose with a moderate fold change. The gluconeogenesis-associated gene fruc-tose-1,6-bisphosphataseFBP1was also sharply up-regulated by 27-fold, which was also observed in a microarray experiment of a xylose utilising recombinantS.cerevisiaestrain [33]. β-Oxidation reactions and fatty acid activation reactions were clearly up-regulated.

Further, there was also a strong apparent down-regulatory effect on alcohol dehydrogenases and aldehyde dehydrogenases, and both the citrate synthase of the TCA cycle and the isocitrate lyase of the glyoxylate cycle were seemingly dramatically up-regulated. The latter two observa-tions in this analysis, however, are misguided by the lack of compartmentalisation of the response. It should be noted that many metabolic reactions could be catalysed by more than one enzyme, where for some reactions like those catalysed by alcohol dehydrogenase (ADH), there are at least five gene products that could potentially catalyse the same reaction.Fig 12

used only the most extremely altered differentially expressed gene for rendering.Fig 13 pro-vides a different perspective, however, where reactions that have more than one associated gene and which were regulated in opposite directions are identified. These are isocitrate lyase, aldehyde dehydrogenase and alcohol dehydrogenase.

Subsequently, the GO‘cellular_component’terms from the UniProt_SwissProt_fungi were used to render compartmentalised maps reconstructed from MetaCyc pathways.Fig 14shows that in the cytosol xylose utilisation reactions were up-regulated, whereas glycolysis, together

Fig 9. Enzyme-metabolite interaction network around redox cofactors.Left: NADH; Right: NADP+. For reporter metabolites (circles), size indicates the enrichment score of a gene set and colour indicates the up/down direction of regulation. Brightness indicates uni-directionality of regulation. For enzymes (stars), colour indicates the up/down direction of regulation based on the log2(fold change) scheme. (For full information on gene names, the interactive file

S4 Networkor the corresponding annotations inS1 Table, using the“Gene names (primary)”column, may be consulted.)

(18)

with pyruvate decarboxylase, NAD+-specific acetaldehyde dehydrogenase (ALD) and glycerol production were down-regulated. Further, a number of reactions usually associated with the TCA cycle were also present in the cytosol and were constitutively expressed. More than one type of ADH with opposite regulatory direction is present in the cytosol. In the peroxisomal compartment (seeS1 Pathway),β-oxidation of lipids, which inS.cerevisiaeis performed exclu-sively in the peroxisomes [34], is clearly visible with only the 3-hydroxyacyl-CoA dehydroge-nase (OHACYL-DEHYDROG-RXN) gene missing from the annotation. In mitochondria (see

S1 Pathway) the reactions catalysed by the ALDs (ALD4,ALD5,ALD6) and ADHs were still isozyme switching reactions. BothALD4(up-regulated 83-fold) andALD5(down-regulated

Fig 10. Total transcript levels in central metabolic pathways with glucose as the carbon source.Transcript levels for all genes catalysing a reaction were summed. Note the logarithmic scale. Dark grey indicates genes present and constitutively expressed. Light grey indicates genes not found in annotation or combined reactions. For reaction names, seeS1 Pathway.

(19)

167-fold) were strongly differentially expressed. Several ADH genes were found inK. marxia-nus. Two of these were annotated asADH3andADH4, both mitochondrial, whereas five (ADH1,ADH2,ADH6,SFA1andadh) were taken to be cytoplasmic. As previously reported [35], ADH2 was only expressed significantly in the presence of glucose, and was in fact the most significantly down-regulated gene in our dataset (229-fold down-regulated). This corre-sponds to the regulation of the ADH2 orthologs inS.cerevisiaeandK.lactis[35]. Conversely, ADH1 was constitutively expressed, again as previously reported [35], which differs from the regulation inS.cerevisiaeandK.lactiswhere glucose stimulated ADH1 expression. Transcrip-tional rewiring of ADH isozymes is thus present inK.marxianuscompared to its relatives.

In the mitochondrial map, the NADPH specific isocitrate dehydrogenase gene was regu-lated, whereas citrate synthase of the TCA cycle and isocitrate lyase were seemingly

up-Fig 11. Total transcript levels in central metabolic pathways with xylose as the carbon source.For reaction names, seeS1 Pathway.

(20)

regulated. Surprisingly, the glyoxysome map revealed that the glyoxylate cycle specific isoci-trate lyase (ICL1) was down-regulated. Upon further investigation, the isozyme present in the mitochondrion, which was mapped to the TCA cycle by the PathoLogic algorithm using Kegg-Kaas annotations, was in factICL2, the 2-methylisocitrate lyase, perhaps confusingly termed

ICL2. AlthoughICL1andICL2share a high sequence similarity, the latter does not use isoci-trate as a subsisoci-trate and does not produce glyoxylate but uses 2-methylisociisoci-trate instead to produce succinate and pyruvate [36]. The gene product fromICL2is part of the propionyl-coe-zyme A pathway, otherwise known as the 2-methylcitrate pathway [37]. The pathway starts withCIT3(YNR001C), the mitochondrial enzyme that condenses oxaloacetate with propionyl

Fig 12. Uncompartmentalised response to xylose in central carbon metabolism in a log2(fold change) scheme.A log2(fold change) is defined as the

log2ratio of transcripts on xylose divided by that on glucose, as reported by CuffDiff. Reactions vETC, vEthylAcetate and vGrowth were manually added to the model. In the case of more than one enzyme that could perform the same function, the largest fold change in expression was used for the colour rendering. For reaction names, seeS1 Pathway.

(21)

coenzyme A to form 2-methylcitrate. Indeed, the citrate synthase that was up-regulated (Fig 12) was not theCIT1gene (YPR001W) that is both cytosolic and mitochondrial and associated with the TCA cycle, but instead the mitochondrialCIT3. Further inspection of the full GO ontology enrichment set revealed that all three key genes in the 2-methylcitrate pathway were significantly up-regulated. The strength of the response in the 2-methylcitrate pathway is given inTable 1, whereCIT3,PDH1andICL2were up-regulated 294, 32 and 28-fold, respectively, andFig 15shows the connection of this pathway with the TCA cycle. The GO term‘propionate catabolic process’(GO:0019543) was found to be significantly enriched with an enrichment score of 5.2. Apart from the last cycle ofβ-oxidation of odd-chain fatty acids in the peroxi-somes as the source of propionyl-CoA, the latter could be derived from propionate or even

Fig 13. Uncompartmentalised response to xylose in central carbon metabolism as a classification scheme.Blue reactions represent those for which more than one enzyme gene has been assigned and for which some were up-regulated and some down-regulated, referred to as isozyme switching. For reaction names, seeS1 Pathway.

(22)

from threonine breakdown [36]. In the gene set for‘threonine metabolic process’

(GO:0006566), only one gene was up-regulated, namely the low specificity L-threonine aldolase

Fig 14. Compartmentalised response to xylose in central metabolism in the cytoplasm using the classification scheme.Blue reactions represent those for which more than one enzyme gene has been assigned and for which some were up-regulated and some down-regulated, referred to as isozyme switching. For reaction names, seeS1 Pathway.

doi:10.1371/journal.pone.0156242.g014

Table 1. Differential expression of the constituent genes mapped to the GO term‘propionate catabolic process’(GO:0019543) on glucose and xylose, respectively.

Id Value

(Glc)

Value (Xyl)

log2 (FC)

qvalue signt Entry Protein names Gene names

GK5S-1542

5.2 1512.1 8.2 0.001 yes P43635 Citrate synthase 3 (EC 2.3.3.16) CIT3 YPR001W YP9723.01

GK5S-1543

27.5 901.5 5.0 0.001 yes Q12428 Probable 2-methylcitrate dehydratase (EC 4.2.1.79)

PDH1 YPR002W LPZ2W YP9723.02

GK5S-2306

15.5 436.9 4.8 0.001 yes Q12031 Mitochondrial 2-methylisocitrate lyase (EC 4.1.3.30)

ICL2 YPR006C LPZ6C YP9723.06C

(23)

(GLY1). Another gene was also annotated asGLY1and down-regulated. Thus, there is no con-clusive evidence that threonine catabolism was up-regulated. The up-regulation of the

2-methylcitrate pathway is, therefore, more likely for the catabolism of short-chain fatty acids and not threonine.

Painted differential expression pathway maps were generated for all of 235 metabolic path-ways found in the annotation. They can be explored in seeS2 Pathway.

Metabolic Regulation Analysis to dissect hierarchical and metabolic

levels of regulation

Differential metabolic flux analysis can be combined with differential gene expression data in the framework of MRA. From Figs10and11it is evident that in anaplerosis, phosphoenolpyr-uvate carboxykinase had very low transcript levels, while phosphoenolpyrphosphoenolpyr-uvate carboxylase was absent from the annotation. Only the pyruvate carboxylase showed substantial transcript levels and was constitutively expressed. Hence, only pyruvate carboxylase was used in the model for flux estimations. In glucose medium, respiro-fermentative metabolism was observed. The specific uptake rate of glucose (on a dry cell weight basis) was 9.4 mmol h-1mg-1with

Fig 15. 2-Methylcitrate pathway.The suggested route of three-carbon units is indicated in magenta.

(24)

acetate and ethanol produced at 2.0 and 2.7 mmol h-1mg-1, respectively with no glycerol for-mation. In xylose medium, acetate, ethanol and glycerol were absent and the specific xylose uptake rate was estimated at approximately 5 mmol h-1mg-1. The ethanol flux calculated could be considered as a conservative estimate, since some evaporation of ethanol could be expected from the aerobic condition.Fig 16shows the differential flux outputs, as approximated by com-bining FBA with the consumption rates of sugars and production rates of ethanol and acetate (seeS5 TableandS3 Pathwayfor the FBA model, parameters and MRA outputs). The xylose utilisation pathways as well as the transaldolases and transketolase fluxes of the non-oxidative PPP were up-regulated, while the direction of the glucose-6-phosphate isomerase was changed towards the direction of glucose-6-phosphate. Notably, the difference in oxidative pentose

Fig 16. Differential flux analysis of central carbon metabolism.Fluxes which differed by less than 10% between conditions were considered constitutive. For reaction names, seeS3 Pathway.

(25)

phosphate pathway flux was only 5% between glucose and xylose utilisation modes and was thus considered constitutive, as was found in the transcript levels. The rest of central metabolic fluxes were down-regulated. (SeeS3 Pathwayfor a more detailed log2(fold change) scheme).

Although transcript levels do not generally correspond well to protein levels and hence to fluxes according to reports using model organisms [38], the overall patterns of expression in central carbon metabolism in our RNA-seq analysis corresponded well with the predicted flux patterns.

Although there is not always a linear correlation between transcript level, protein concentra-tions and maximal activities of an enzyme, we assume differential transcript expression to pro-vide a sufficient approximation to hierarchical regulation, as was done by others [39].Fig 17

shows the separation of metabolic regulation into hierarchical and metabolic level regulation.

Fig 17. Metabolic Regulation Analysis of central carbon metabolism.For reaction names, seeS3 Pathway.

(26)

It is evident that the genetic level up-regulation of xylose reductase, sorbitol dehydrogenase, xylulose kinase, transaldolase and ribose-5-phosphate isomerase rendered the regulation as purely hierarchical. Of these, sorbitol dehydrogenase acting as a xylitol dehydrogenase would be the most extreme example, as it showed a 208-fold genetic up-regulation. These four reac-tions interact with neighbouring reacreac-tions via the metabolites xylulose-5-phosphate, erythrose-4-phosphate, fructose-6-phosphate and ribose-5-phosphate to stimulate fluxes through the transketolases, and to reverse the flux through glucose-6-phosphate isomerase, whereas ribose-5-phsophate isomerase may lower the flux through ribulose-phosphate 3-epimerase by lower-ing the concentration of ribulose-5-phosphate. Notably, regulation of fluxes in lower glycolysis is dominated by the genetic component, whereas upper glycolysis is regulated approximately equally by the metabolic and genetic regulation levels. Downward from pyruvate kinase, regu-lation is dominated by changes in metabolite levels, while some contribution of the genetic reg-ulation level is evident for the fluxes through the aconitate hydratases in the TCA cycle. Due to the absence of measured ethanol and acetate in the xylose medium, the disappearance of fluxes through pyruvate decarboxylase, alcohol dehydrogenase and aldehyde dehydrogenase rendered classification of regulation in these reactions as purely metabolic.

Discussion

Modelling of bioprocesses has been attempted for a long time, in many different forms. It was realised in the second half of the twentieth century that several confounding factors had to be eliminated in the study of metabolism and its gene regulation, arguably with the growth rate as the most important confounding factor, since the expression level of many genes are correlated with the growth rate. Chemostat cultivation is a useful tool, since the growth rate can be easily controlled. However, for the vast majority of industrial applications, chemostats are unrealistic as industrial applications typically involve batch cultivation, or fed-batch cultivation to avoid catabolite repression, improve volumetric productivity and allow sufficient aeration by control-ling the growth rate. Moreover, systems biology has become increasingly data-driven with the availability of high-throughput methods to measure large numbers of intracellular metabolites (metabolomics), proteins (proteomics) and especially RNAs (RNA-seq and microarray) from a single experiment. It is proposed here that much more could be learned in terms of genetic reg-ulation in microorganisms by testing a variety of different cultivation conditions, especially substrates, in small batch experiments as compared to the same amount of work in a more sophisticated, labour intensive chemostat setup, or at least, complementary information. We showed that in a simple and cost-effective batch setup, a large amount of rich RNA-seq data could be generated and fruitfully explored. This small working volume also makes expensive isotope labelling studies feasible. In batch experiments, however, special care needs to be taken in terms of the timing of sampling, as, for instance, fermentation products would accumulate over time and a transcriptome sample late in the fermentation may be more reflective of chem-ical stress than of the metabolic mode. This may especially be a confounding factor when com-paring the effects of the concentration of substrate. Our experimental design alleviated this effect and allowed us to focus on the effects of alternative substrates only.

(27)

compartmentalisation revealed several important features that would have been missed using either on its own.

The vastness of the xylose response under aerobic conditions indicates a different, opportu-nistic lifestyle that this yeast apparently adapts to when cultivated on xylose, which may be reminiscent of its natural environment where a variety of plant-derived substrates may be uti-lised. Down-regulation of many biosynthetic pathways is concordant with a lower growth rate, although there is a notable absence of growth rate specific gene sets in the enrichment statistics. Although many mitochondrial genes were differentially expressed, these are not the major enzymes in energy production in mitochondria. Oxidative stress also seems minimal in the aer-obic xylose medium. Instead, a strong response is seen in the up-regulation of alternative sugar utilisation machinery, including inulinase, sugar transporters and catabolic routes for alterna-tive sugars. Inulin is a fructan stored in large amounts in some plants, includingA.americana

[40]. Our strain was indeed isolated from anA.americanasample.

K.marxianuslacks enzymes such as secreted proteases, lipases and carbohydrate hydro-lases. This yeast may thus be dependent on other fungi in the environment for these functions, or its natural habitat may be some commensal niche where these monomers are supplied by the plant. The transcriptional regulatory basis for this response is likely to be glucose repression by transcription factors such asMIG1binding to carbon source response elements, as was sug-gested to be the case for the inulinase geneKmINU1[4,41].

It is interesting to find that the majority of genes for a complete organelle like the peroxi-some, which is dedicated to lipid oxidation, was dramatically up-regulated in the xylose medium, yet without apparent function in the experimental setting. Thus, glucose de-repres-sion is likely sufficient to activate most of the response to enable lipid catabolism, and stimula-tion by lipids may play a smaller role. The mitochondrial 2-methylcitrate pathway for the degradation of three-carbon molecules was also strongly up-regulated, suggesting that a variety of odd-chain fatty acids originating from peroxisomes may be oxidised.

FBA simulations predicted no significant up-regulation of the oxidative PPP flux, but a more intense up-regulation of the non-oxidative PPP flux and a down-regulation of glycolysis, which were consistent with RNA-seq data. Since biomass formation, the major sink for NADPH, is down-regulated in xylose medium, and another NADPH sink, xylose reductase, is up-regulated, it thus makes sense that the major source flux of NADPH (oxidative PPP) may be similar in both conditions (considering absolute fluxes normalised by the biomass concen-tration). Normalising fluxes to the uptake rate of a carbon source or to the biomass formation rate could thus be misleading when interpreting expression data. FBA simulations here thus shed light on what could be expected in terms of gene regulation.

MRA was used to separate metabolic regulation into hierarchical and metabolic levels. Reac-tions in lower glycolysis are known to a have high flux capacity due to high enzyme concentraReac-tions, and our transcript data also indicated this feature on glucose. Chemostat studies ofS.cerevisiae

(28)

Reporter metabolite networking suggested that NAD(H), acetaldehyde, ethanol, glyceralde-hyde-3-phosphate, 3-phosphoglycerate and hydrogen peroxide formed a strongly interconnec-ted redox active system, with aldehyde dehydrogenases and alcohol dehydrogenases being the main players. This system may work across membranes, since acetaldehyde, ethanol and hydrogen peroxide can cross membranes, shuttles connect NAD(H) across compartments, and glyceraldehyde-3-phosphate and 3-phosphoglycerate are closely connected to NADH. At this stage it is not possible to resolve the fluxes and MRA through the acetaldehyde-dependent pyruvate dehydrogenase bypass. It was, however, evident that dramatic changes occurred in the transcript levels of the alcohol dehydrogenases and aldehyde dehydrogenases, including compartment-specific isozyme switching. As also suggested by Lertwattanasakul et al. [3], etha-nol may be catabolised in the mitochondria as fast as it is produced. However, it has to be emphasised that care needs to be taken in extrapolating gene expression data between studies carried out under aerobic and anaerobic conditions. Oxygen limitation results in differential expression of many genes in yeasts.

Conclusions

We believe to have captured in a unique manner, and from a number of perspectives, a complex transcriptional pattern telling an interesting story about how the cell‘explores its options’when the nutrient availability changes under aerobic conditions. Strong up-regulation of transporters and pathways for utilisation of alternative carbohydrates was evident. In addition, the more opportunistic lifestyle was supported by invasive growth, and sexual reproduction was activated as a long-term survival strategy. The strong peroxisomal fatty acid catabolic response accompa-nied by the mitochondrial 2-methylcitrate pathway is likely explained by glucose de-repression, similar to that seen for carbohydrate utilisation. AsK.marxianusseemingly lacks the secreted enzymes required for depolymerisation of biopolymers, the species is probably dependent on other species for supply of monomers such as sugars, amino acids and free fatty acids, whereas inulinase is a specialist feature enabling this species to utilise this plant storage oligosaccharide. It would be interesting to see whether xylose may have a stimulatory role as suggested recently for

Saccharomyces cerevisiae[43] and through which signaling pathways this may take place. MRA was demonstrated here as an informative method to dissect the regulation of fluxes into the metabolic and hierarchical levels. It is evident that the genetic level plays a dominating role in the regulation of fluxes in central carbon metabolism, not only in the early enabling steps of utilisation of xylose as the carbon source, but also in the high capacity reactions of lower glycolysis. In kinetic modelling of metabolism, emphasis should thus be placed on genetic regulation, which is currently very challenging. Multiple omics would need to be com-bined to predict such regulatory networks and build realistic models. However, in order to resolve fluxes originating from pyruvate, isotopic tracer studies would be required, which cur-rently is under investigation. In addition, the isozyme switching observed with alcohol dehy-drogenase, aldehyde dehydrogenase and acetate-CoA ligase calls for a detailed investigation into the compartmentalisation and kinetics of these enzymes, and detailed bottom-up kinetic modelling to understand the role of this interesting behaviour.

Supporting Information

S1 Draft Genome. Draft genome forK.marxianusUFS-Y2791 reconstructedde novo.

(FA)

S1 Network. Pathway-to-pathway Network.

(29)

S2 Network. Molecule-to-molecule network.

(CDF)

S3 Network. Enzyme-Reporter metabolite network.

(CDF)

S4 Network. Enzyme-reporter metabolite network (NAD).

(CDF)

S5 Network. Enzyme-reporter metabolite network (NADP).

(CDF)

S1 ORF. Putative open reading frames found in the genome.

(GFF)

S1 Pathway. RNA-seq data mapping to central metabolism.

(PPTX)

S2 Pathway. Pathway multi-map.

(PDF)

S3 Pathway. Estimated metabolic flux maps.

(PPTX)

S1 Proteins. Amino acid sequences.

(AA)

S1 Table. Main data tables.

(XLSX)

S2 Table. Enrichment statistics of GO cellular_component.

(XLSX)

S3 Table. Enrichment statistics of GO molecular_function.

(XLSX)

S4 Table. Enrichment statistics of GO biochemical_process.

(XLSX)

S5 Table. Metabolic flux model and Metabolic Regulation Analysis.

(XLSX)

S1 Text. Supplementary text on gene set enrichment and extracellular enzymes.

(PDF)

Acknowledgments

The authors would like to thank Dr. Jonathan Featherston, Dr. Dirk Swanevelder, Prof. Jasper Rees and Ms Minique De Castro at the Onderstepoort Biotechnology Platform, Pretoria, South Africa for sequencing work and fruitful discussions.

Author Contributions

(30)

References

1. Groeneveld P, Stouthamer AH, Westerhoff HV. Super life–how and why‘cell selection’leads to the fast-est-growing eukaryote. FEBS J. 2009; 276: 254–270. doi:10.1111/j.1742-4658.2008.06778.xPMID:

19087200

2. Rocha SN, Abrahão-Neto J, Gombert AK. Physiological diversity within theKluyveromyces marxianus species. Antonie van Leeuwenhoek. 2011; 100: 619–630. doi:10.1007/s10482-011-9617-7PMID:

21732033

3. Lertwattanasakul N, Kosaka T, Hosoyama A, Suzuki Y, Rodrussamee N, Matsutani M, et al. Genetic basis of the highly efficient yeastKluyveromyces marxianus: complete genome sequence and tran-scriptome analyses. Biotechnol. Biofuels. 2015; 8(47). doi:10.1186/s13068-015-0227-x

4. Gao J, Yuan W, Li Y, Xiang R, Hou S, Zhong S, Bai F. Transcriptional analysis ofKluyveromyces marx-ianusfor ethanol production from inulin using consolidated bioprocessing technology. Biotechnol. Bio-fuels. 2015; 8:(115). doi:10.1186/s13068-015-0295-y

5. van Dijk EL, Auger H, Yan Jaszczyszyn Y, Thermes C. Ten years of next-generation sequencing tech-nology. Trends Genet. 2014; 30(9): 418–426. doi:10.1016/j.tig.2014.07.001PMID:25108476

6. Buermans HPJ, den Dunnen JT. Next generation sequencing technology: Advances and applications. Biochim. Biophys. Acta. 2014; 1842: 1932–1941. doi:10.1016/j.bbadis.2014.06.015PMID:24995601

7. Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan GJ, van Baren M, et al. Transcript assembly and abundance estimation from RNA-Seq reveals thousands of new transcripts and switching among iso-forms. Nat. Biotechnol. 2010; 28(5): 511–515. doi:10.1038/nbt.1621PMID:20436464

8. Gao QG, Jin K, Ying S, Zhang Y, Xiao G, Shang Y, et al. Genome Sequencing and Comparative Tran-scriptomics of the Model Entomopathogenic FungiMetarhizium anisopliaeandM.acridum. PLOS Genet. 2011; 1(7). doi:10.1371/journal.pgen.1001264

9. Maciejewski H. Gene set analysis methods: statistical models and methodological differences. Brief-ings Bioinf. 2013; 15: 504–518. doi:10.1093/bib/bbt002

10. Karp PD, Paley SM, Krummenacker M, Latendresse M, Dale JM, Lee TJ, et al. PathwayTools version 13.0: integrated software for pathway/genome informatics and systems biology. Briefings Bioinf. 2009; 11(1): 40–79. doi:10.1093/bib/bbp043

11. Tang J. Microbial metabolomics. Curr. Genomics. 2011; 12: 391–403. doi:10.2174/ 138920211797248619PMID:22379393

12. Fiehn O. Metabolomics–the link between genotypes and phenotypes. Plant Mol. Biol. 2002; 48: 155–

171. PMID:11860207

13. Patil KR, Nielsen J. Uncovering transcriptional regulation of metabolism by using metabolic network topology. Proc. Natl. Acad. Sci. U. S. A. 2005; 102(8): 2685–2689. PMID:15710883

14. Oliveira AP, Patil KR, Nielsen J. Architecture of transcriptional regulatory circuits is knitted over the topology of bio-molecular interaction networks. BMC Syst. Biol. 2008; 2(17). doi: 10.1186/1752-0509-2-17

15. Schilling CH, Edwards JS, Palsson BO. Toward metabolic phenomics: analysis of genomic data using flux balances. Biotechnol. Prog. 1999; 15: 288–295. PMID:10356245

16. Covert MW, Schilling CH, Famili I, Edwards JS, Goryanin II, Selkov E, et al. Metabolic modeling of microbial strainsin silico. Trends Biochem. Sci. 2001; 26(3): 179–186. pii: S0968-0004(00)01754-0. doi:10.1016/S0968-0004(00)01754-0PMID:11246024

17. Dias O, Pereira R, Gombert AK Ferreira EC, Rocha I. iOD907, the first genome-scale metabolic model for the milk yeastKluyveromyces lactis. Biotechnol. J. 2014; 9(6): 776–790. doi:10.1002/biot. 201300242PMID:24777859

18. ter Kuile BH, Westerhoff HV. Transcriptome meets metabolome: hierarchical and metabolic regulation of the glycolytic pathway. FEBS lett. 2001; 500. pii: S0014-5793(01)02613-8.

19. du Preez JC, van der Walt JP. Fermentation of D-xylose to ethanol by a strain ofCandida shehatae. Biotechnol. Lett. 1983; 5(3): 357–362.

20. Bolger AM, Lohse M, Usadel B. Trimmomatic: A flexible trimmer for Illumina Sequence Data. Bioinfor-matics. 2014; 30(15): 2114–2120. doi:10.1093/bioinformatics/btu170PMID:24695404

21. Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, et al. Differential gene and transcript expres-sion analysis of RNA-seq experiments with TopHat and Cufflinks. Nat. Protoc. 2013; 7(3): 562–578. doi:10.1038/nprot.2012.016

22. Ideker T, Ozier O, Schwikowski B, Siegel AF. Discovering regulatory and signaling circuits in molecular interaction networks. Bioinformatics. 2002; 18(1): 233–240.

Referências

Documentos relacionados

Revista Científica Eletrônica de Medicina Veterinária é uma publicação semestral da Faculdade de Medicina Veterinária e Zootecnia de Garça – FAMED/FAEF e Editora FAEF,

Due to the similarity between allergic hypersensitivity reactions and non-allergic hypersensitivity reactions and hyperammonemia, the differential diagnosis of the reactions related

(C) The transcript levels of KRP OE lines were examined by RT-PCR-based DNA gel blot analysis and showed highest transcript levels in the KRP3 OE in the KRP7 OE line

Objectives : To assess the prevalence of the main metabolic and anatomical changes and the chemical analysis of stone found in patients with nephrolithiasis in the West

tuberculosis in the chemostat, and metabolites associated with amino acid metabolism were found to be associated with down-regulated genes in the same dataset (a response shared

Based on elementary flux mode analysis, we combine sequence information with metabolic pathway analysis and include, as a novel aspect, circadian regulation.. While minimizing the

The improvement of adapted metabolic reactions under sport training depends on the appropriate increase of regulatory enzymes within the glycolytic and oxidative pathways.. The

Figure 2 - Effect of drying rate on primary root protrusion (a) , percentage of normal seedlings (%) (b) and germination speed index (GSI) (c) of Campomanesia adamantium