Abstract

Flux balance analysis (FBA) is a mathematical approach for analyzing the flow of metabolites through a metabolic network. The stoichiometry of the reactions is extracted from the genome-scale metabolic network and arranged into a matrix $S$ in which the columns represent the different reactions and the rows represent the unique metabolites. The matrix elements are the stoichiometric coefficients, with a positive sign for products and a negative sign for reactants. The fluxes through all of the reactions are represented by the vector $v$. The mass balance at steady state is given by the equation $\frac{dx}{dt} = S \cdot v = 0$, where $x$ is the vector of metabolite concentrations. Solving this underdetermined system together with an optimization objective gives the desired flux distribution in the network (Orth et al. 2011).
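As a minimal sketch of the bookkeeping described above, the following builds the stoichiometric matrix for a small network and checks the steady-state condition for two candidate flux vectors. The three-reaction network and its names (R_up, R1, R_bio, metabolites A and B) are hypothetical and chosen only for illustration.

```python
# Hypothetical toy network (not from any real reconstruction):
#   R_up : (nothing) -> A     nutrient uptake
#   R1   : A -> B             conversion
#   R_bio: B -> (nothing)     biomass drain
metabolites = ["A", "B"]
reactions = ["R_up", "R1", "R_bio"]

# Rows are metabolites, columns are reactions.
# Negative entries are consumed, positive entries are produced.
S = [
    [1, -1,  0],   # A
    [0,  1, -1],   # B
]

def net_production(S, v):
    """Return S . v, the net rate of change of each metabolite."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]

balanced = [5.0, 5.0, 5.0]     # all fluxes equal: nothing accumulates
unbalanced = [5.0, 3.0, 3.0]   # uptake exceeds conversion: A accumulates

dxdt_balanced = net_production(S, balanced)
dxdt_unbalanced = net_production(S, unbalanced)
print(dict(zip(metabolites, dxdt_balanced)))
print(dict(zip(metabolites, dxdt_unbalanced)))
```

Only the first flux vector satisfies $S \cdot v = 0$; the second leaves a positive net production of A and is therefore not a steady state.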

Mathematical basis

In contrast to the traditionally followed approach of metabolic modeling using coupled ordinary differential equations, flux balance analysis requires very little information in terms of the enzyme kinetic parameters and concentration of metabolites in the system. It achieves this by making two assumptions, steady state and optimality. The first assumption is that the modeled system has entered a steady state, where the metabolite concentrations no longer change, i.e. in each metabolite node the producing and consuming fluxes cancel each other out. The second assumption is that the organism has been optimized through evolution for some biological goal, such as optimal growth or conservation of resources. The steady-state assumption reduces the system to a set of linear equations, which is then solved to find a flux distribution that satisfies the steady-state condition subject to the stoichiometry constraints while maximizing the value of a pseudo-reaction (the objective function) representing the conversion of biomass precursors into biomass.
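The optimization described above is commonly written as a linear program. The symbols $c$ (a weight vector that selects the objective reaction) and $v_{\min}$, $v_{\max}$ (lower and upper flux bounds) are notation introduced here for illustration; they do not appear elsewhere in this text.

$$
\begin{aligned}
\max_{v} \quad & c^{T} v \\
\text{subject to} \quad & S \cdot v = 0, \\
& v_{\min} \le v \le v_{\max}
\end{aligned}
$$

When the objective is biomass production, $c$ is typically 1 for the biomass pseudo-reaction and 0 for every other reaction.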

The main advantage of the flux balance approach is that it does not require any knowledge of the metabolite concentrations or, more importantly, of the enzyme kinetics of the system. The steady-state (homeostasis) assumption removes the need to know metabolite concentrations at any time, as long as they remain constant. It also removes the need for specific rate laws, since at steady state the size of each metabolite pool does not change. The stoichiometric coefficients alone are sufficient for the mathematical maximization of a specific objective function.
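To illustrate that stoichiometry, flux bounds and an objective are all that is needed, here is a minimal sketch that solves a tiny FBA problem with SciPy's `linprog`. The four-reaction network, the reaction names and the bound values are assumptions made for this example, not part of any published model.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (illustrative only):
#   R_up : -> A        uptake, capped at 10 flux units
#   R1   : A -> B
#   R2   : A -> B      parallel route
#   R_bio: B ->        biomass pseudo-reaction (the objective)
reactions = ["R_up", "R1", "R2", "R_bio"]
S = np.array([
    [1, -1, -1,  0],   # metabolite A
    [0,  1,  1, -1],   # metabolite B
])
bounds = [(0, 10), (0, 1000), (0, 1000), (0, 1000)]

# linprog minimizes, so the biomass flux gets weight -1 to maximize it.
c = np.zeros(len(reactions))
c[reactions.index("R_bio")] = -1.0

result = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                 bounds=bounds, method="highs")
max_growth = -result.fun
fluxes = dict(zip(reactions, result.x))
print(f"optimal biomass flux: {max_growth:.2f}")
```

No kinetic parameter or concentration appears anywhere in the problem definition. The biomass flux is limited only by the uptake bound, and the split between R1 and R2 is left undetermined by the optimization.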

The objective function is essentially a measure of how each component in the system contributes to the production of the desired product. The product itself depends on the purpose of the model, but one of the most common examples is the study of total biomass. A notable example of the success of FBA is its ability to accurately predict the growth rate of the prokaryote E. coli when cultured in different conditions. In this case, the metabolic system was optimized to maximize the biomass objective function. However, the model can be used to optimize the production of any product, and it is often used to determine the output level of some biotechnologically relevant product. The model itself can be experimentally verified by cultivating organisms in a chemostat or similar device, which ensures that nutrient concentrations are held constant. Measurements of the production of the desired objective can then be used to correct the model. (https://en.wikipedia.org/wiki/Flux_balance_analysis)

Extensions and comparison

The success of FBA and the recognition of its limitations have led to extensions that attempt to mitigate those limitations. The optimal solution to the flux-balance problem is rarely unique; many different, equally optimal solutions usually exist. Flux variability analysis (FVA), built into some analysis software, returns the range of flux through each reaction that, paired with the right combination of other fluxes, is still consistent with the optimal solution. Reactions that can support only a low variability of flux are likely to be of higher importance to an organism, and FVA is a promising technique for identifying such reactions.
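The idea behind FVA can be sketched as two steps: solve FBA once, then minimize and maximize each flux in turn while holding the objective at its optimum. The toy network, reaction names and the small numerical tolerance below are assumptions for illustration; dedicated FBA packages implement this more efficiently.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network, invented for illustration:
#   R_up: -> A (max 10),  R1: A -> B,  R2: A -> B,  R_bio: B -> (objective)
reactions = ["R_up", "R1", "R2", "R_bio"]
S = np.array([
    [1, -1, -1,  0],   # metabolite A
    [0,  1,  1, -1],   # metabolite B
])
bounds = [(0, 10), (0, 1000), (0, 1000), (0, 1000)]
b_eq = np.zeros(S.shape[0])
bio = reactions.index("R_bio")

# Step 1: ordinary FBA gives the optimal biomass flux.
c = np.zeros(len(reactions))
c[bio] = -1.0                      # linprog minimizes, so negate to maximize
optimum = -linprog(c, A_eq=S, b_eq=b_eq, bounds=bounds, method="highs").fun

# Step 2: require (numerically) optimal biomass, then minimize and
# maximize every flux under that extra constraint.
TOLERANCE = 1e-6                   # assumed slack to avoid numerical infeasibility
fva_bounds = list(bounds)
fva_bounds[bio] = (optimum * (1 - TOLERANCE), bounds[bio][1])

flux_range = {}
for j, name in enumerate(reactions):
    c = np.zeros(len(reactions))
    c[j] = 1.0
    low = linprog(c, A_eq=S, b_eq=b_eq, bounds=fva_bounds, method="highs").fun
    high = -linprog(-c, A_eq=S, b_eq=b_eq, bounds=fva_bounds, method="highs").fun
    flux_range[name] = (low, high)
    print(f"{name}: [{low:.2f}, {high:.2f}]")
```

In this toy case the uptake and biomass reactions are pinned to a single value, while the two parallel routes R1 and R2 can each carry anything between zero and the full flux, which is the kind of variability FVA is meant to expose.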

When simulating knockouts or growth on media, FBA gives the final steady-state flux distribution. This final steady state is reached on varying time-scales. For example, the observed growth rate of E. coli on glycerol as the primary carbon source did not initially match the FBA prediction; however, after sub-culturing for 40 days, or 700 generations, the growth rate adaptively evolved to match the FBA prediction. Sometimes it is of interest to find the immediate effect of a perturbation or knockout, since it takes time for regulatory changes to occur and for the organism to re-organize fluxes to optimally utilize a different carbon source or circumvent the effect of the knockout. MOMA (Minimization Of Metabolic Adjustment) predicts the immediate sub-optimal flux distribution following the perturbation by using quadratic programming to minimize the Euclidean distance between the wild-type FBA flux distribution and the mutant flux distribution.
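The MOMA idea can be sketched on a toy network as follows. This is an illustration under stated assumptions: the network and reaction names are hypothetical, the wild-type flux vector is simply assumed (it is one of several FBA optima of the toy model), and SciPy's general-purpose SLSQP solver stands in for a dedicated quadratic programming solver.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical toy network:
#   R_up: -> A (max 10),  R1: A -> B,  R2: A -> B,  R_bio: B ->
reactions = ["R_up", "R1", "R2", "R_bio"]
S = np.array([
    [1.0, -1.0, -1.0,  0.0],   # metabolite A
    [0.0,  1.0,  1.0, -1.0],   # metabolite B
])
bounds = [(0, 10), (0, 1000), (0, 1000), (0, 1000)]

# Assumed wild-type flux distribution: all flux goes through R1.
v_wt = np.array([10.0, 10.0, 0.0, 10.0])

# Knock out R1 by adding the equality v_R1 = 0 to the steady-state rows.
knockout_row = np.zeros((1, len(reactions)))
knockout_row[0, reactions.index("R1")] = 1.0
A_eq = np.vstack([S, knockout_row])

def squared_distance(v):
    """Squared Euclidean distance to the wild-type flux vector."""
    return float(np.sum((v - v_wt) ** 2))

moma = minimize(
    squared_distance,
    x0=np.zeros(len(reactions)),           # the zero vector is feasible
    jac=lambda v: 2.0 * (v - v_wt),
    bounds=bounds,
    constraints=[{"type": "eq", "fun": lambda v: A_eq @ v, "jac": lambda v: A_eq}],
    method="SLSQP",
)
v_moma = moma.x
print(dict(zip(reactions, np.round(v_moma, 3))))
```

For this toy mutant, FBA alone would predict that growth fully recovers by rerouting everything through R2. The distance-minimizing solution instead carries a flux of about 6.67 through uptake, R2 and biomass, because moving further from the wild-type values costs more than it gains. This is the sub-optimal intermediate state that MOMA is intended to capture.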

ROOM (Regulatory On-Off Minimization) attempts to improve the prediction of the metabolic state of an organism after a gene knockout. It follows the same premise as MOMA: that an organism would try to restore a flux distribution as close as possible to the wild-type after a knockout. However, it further hypothesizes that this steady state would be reached through a series of transient metabolic changes by the regulatory network, and that the organism would try to minimize the number of regulatory changes required to reach the wild-type state. Instead of minimizing a distance metric, however, it uses a mixed integer linear programming method.

Dynamic FBA attempts to add the ability for models to change over time, thus in some ways avoiding the strict steady state condition of pure FBA. Typically the technique involves running an FBA simulation, changing the model based on the outputs of that simulation, and rerunning the simulation. By repeating this process an element of feedback is achieved over time.
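The run, update, rerun loop can be sketched as below. Everything specific here is an assumption for illustration: the toy network, the Michaelis-Menten style uptake cap and its parameters (V_MAX, K_M), the time step, the starting concentrations and the explicit Euler update. Units are nominal.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network:  R_up: -> A,  R1: A -> B,  R_bio: B -> (objective)
S = np.array([
    [1, -1,  0],   # metabolite A
    [0,  1, -1],   # metabolite B
])
b_eq = np.zeros(S.shape[0])
c = np.array([0.0, 0.0, -1.0])     # maximize the biomass flux

V_MAX, K_M = 10.0, 0.5             # assumed uptake kinetics
dt = 0.05                          # assumed time step
substrate, biomass = 10.0, 0.1     # assumed initial external state

history = [(0.0, substrate, biomass)]
for step in range(1, 201):
    # Change the model based on the current state: uptake is capped by the
    # kinetics and by what is actually left in the medium for this step.
    kinetic_cap = V_MAX * substrate / (K_M + substrate)
    supply_cap = substrate / (biomass * dt)
    bounds = [(0, min(kinetic_cap, supply_cap)), (0, 1000), (0, 1000)]

    # Rerun FBA with the updated bounds.
    v = linprog(c, A_eq=S, b_eq=b_eq, bounds=bounds, method="highs").x

    # Feed the FBA output back into the external state (explicit Euler).
    substrate = max(substrate - v[0] * biomass * dt, 0.0)
    biomass += v[2] * biomass * dt
    history.append((step * dt, substrate, biomass))
    if substrate <= 1e-6:
        break

print(f"substrate exhausted after {history[-1][0]:.2f} time units, "
      f"final biomass {history[-1][2]:.2f}")
```

Each FBA solve is still a steady-state calculation; the dynamics come only from updating the uptake bound and the external concentrations between solves.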

FBA provides a less simplistic analysis than choke point analysis while requiring far less information on reaction rates and a much less complete network reconstruction than a full dynamic simulation would require. In filling this niche, FBA has been shown to be a very useful technique for analysis of the metabolic capabilities of cellular systems. Unlike choke point analysis, which only considers points in the network where metabolites are produced but not consumed or vice versa, FBA is a true form of metabolic network modelling because it considers the metabolic network as a single complete entity (the stoichiometric matrix) at all stages of analysis. This means that network effects, such as chemical reactions in distant pathways affecting each other, can be reproduced in the model. The upside of the inability of choke point analysis to simulate network effects is that it considers each reaction within a network in isolation, and thus can suggest important reactions even if a network is highly fragmented and contains many gaps.

Unlike dynamic metabolic simulation, FBA assumes that the internal concentration of metabolites within a system stays constant over time and thus is unable to provide anything other than steady-state solutions. It is unlikely that FBA could, for example, simulate the functioning of a nerve cell. Since the internal concentration of metabolites is not considered within a model, it is possible that an FBA solution could contain metabolites at a concentration too high to be biologically acceptable. This is a problem that dynamic metabolic simulations would probably avoid. One advantage of the simplicity of FBA over dynamic simulations is that it is far less computationally expensive, allowing the simulation of large numbers of perturbations to the network. A second advantage is that the reconstructed model can be substantially simpler, because it avoids the need to consider enzyme rates and the effect of complex interactions on enzyme kinetics. (https://en.wikipedia.org/wiki/Flux_balance_analysis)

Simulations

FBA is not computationally intensive, taking on the order of seconds to calculate optimal fluxes for biomass production for a typical network (around 2000 reactions). This means that the effect of deleting reactions from the network and/or changing flux constraints can be sensibly modelled on a single computer.

Single reaction deletion

Single reaction deletion is a frequently used technique to search a metabolic network for reactions that are particularly critical to the production of biomass. By removing each reaction in a network in turn and measuring the predicted flux through the biomass function, each reaction can be classified as either essential (if the flux through the biomass function is substantially reduced) or non-essential (if the flux through the biomass function is unchanged or only slightly reduced).
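A minimal sketch of a single reaction deletion scan is shown below. The toy network, the reaction names and the 5% cutoff used to decide what counts as "substantially reduced" are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network:
#   R_up: -> A (max 10),  R1: A -> B,  R2: A -> B,  R_bio: B -> (objective)
reactions = ["R_up", "R1", "R2", "R_bio"]
S = np.array([
    [1, -1, -1,  0],   # metabolite A
    [0,  1,  1, -1],   # metabolite B
])
bounds = [(0, 10), (0, 1000), (0, 1000), (0, 1000)]

def max_biomass(flux_bounds):
    """Solve FBA and return the optimal biomass flux (0 if infeasible)."""
    c = np.zeros(len(reactions))
    c[reactions.index("R_bio")] = -1.0      # linprog minimizes
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=flux_bounds, method="highs")
    return -res.fun if res.success else 0.0

wild_type = max_biomass(bounds)
THRESHOLD = 0.05   # assumed: below 5% of wild-type growth counts as essential

essential = {}
for j, name in enumerate(reactions):
    deleted = list(bounds)
    deleted[j] = (0, 0)                     # remove the reaction
    essential[name] = max_biomass(deleted) < THRESHOLD * wild_type

print(essential)
```

In this toy model the uptake and biomass reactions come out as essential, while R1 and R2 each come out as non-essential because the other route can compensate.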

Pairwise reaction deletion

Pairwise reaction deletion of all possible pairs of reactions is useful when looking for drug targets, as it allows the simulation of multi-target treatments, either by a single drug with multiple targets or by drug combinations. Double deletion studies can also quantify the synthetic lethal interactions between different pathways, providing a measure of the contribution of each pathway to overall network robustness.
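The following sketch scans all reaction pairs of a toy network and reports the synthetic lethal ones, meaning pairs that are lethal together although neither deletion is lethal alone. The network, the names and the 5% lethality cutoff are assumptions for illustration.

```python
from itertools import combinations

import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network:
#   R_up: -> A (max 10),  R1: A -> B,  R2: A -> B,  R_bio: B -> (objective)
reactions = ["R_up", "R1", "R2", "R_bio"]
S = np.array([
    [1, -1, -1,  0],   # metabolite A
    [0,  1,  1, -1],   # metabolite B
])
bounds = [(0, 10), (0, 1000), (0, 1000), (0, 1000)]

def max_biomass(knocked_out=()):
    """Optimal biomass flux with the named reactions constrained to zero."""
    ko_bounds = [(0, 0) if name in knocked_out else b
                 for name, b in zip(reactions, bounds)]
    c = np.zeros(len(reactions))
    c[reactions.index("R_bio")] = -1.0      # linprog minimizes
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=ko_bounds, method="highs")
    return -res.fun if res.success else 0.0

wild_type = max_biomass()
THRESHOLD = 0.05   # assumed lethality cutoff (fraction of wild-type growth)

def is_lethal(knocked_out):
    return max_biomass(knocked_out) < THRESHOLD * wild_type

single_lethal = {name for name in reactions if is_lethal((name,))}
synthetic_lethal = [
    pair for pair in combinations(reactions, 2)
    if is_lethal(pair) and not (set(pair) & single_lethal)
]
print("lethal alone:", sorted(single_lethal))
print("synthetic lethal pairs:", synthetic_lethal)
```

The number of pairs grows quadratically with the number of reactions, so for a genome-scale model this scan means solving a very large number of linear programs, each of which is cheap.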

Single and multiple gene deletions

Genes are connected to enzyme-catalyzed reactions by Boolean expressions known as Gene-Protein-Reaction expressions (GPR). Typically a GPR takes the form (Gene A AND Gene B) to indicate that the products of genes A and B are protein sub-units that assemble to form the complete protein and therefore the absence of either would result in deletion of the reaction. On the other hand, if the GPR is (Gene A OR Gene B) it implies that the products of genes A and B are isozymes. Therefore, it is possible to evaluate the effect of single or multiple gene deletions by evaluation of the GPR as a Boolean expression. If the GPR evaluates to false, the reaction is constrained to zero in the model prior to performing FBA. Thus gene knockouts can be simulated using FBA.
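Evaluating a GPR as a Boolean expression can be sketched with a few lines of plain Python. The gene names (geneA to geneD), the reaction names and the helper functions below are hypothetical; real model formats store GPRs in their own syntax.

```python
import re

def reaction_is_active(gpr, knocked_out):
    """Evaluate a GPR string; genes listed in `knocked_out` count as False."""
    tokens = []
    for token in re.findall(r"\(|\)|\w+", gpr):
        if token in "()":
            tokens.append(token)
        elif token.upper() in ("AND", "OR"):
            tokens.append(token.lower())
        else:
            tokens.append(str(token not in knocked_out))
    # Only parentheses, and/or and True/False remain, so eval is safe here.
    return eval(" ".join(tokens), {"__builtins__": {}})

def apply_knockouts(bounds, gprs, knocked_out):
    """Constrain to zero every reaction whose GPR evaluates to False."""
    return {
        rxn: (0.0, 0.0)
        if rxn in gprs and not reaction_is_active(gprs[rxn], knocked_out)
        else b
        for rxn, b in bounds.items()
    }

# Hypothetical example: R1 needs both subunits, R2 has two isozymes.
gprs = {"R1": "(geneA AND geneB)", "R2": "(geneC OR geneD)"}
bounds = {"R_up": (0.0, 10.0), "R1": (0.0, 1000.0), "R2": (0.0, 1000.0)}

after_single = apply_knockouts(bounds, gprs, {"geneA", "geneC"})
after_double = apply_knockouts(bounds, gprs, {"geneC", "geneD"})
print(after_single)
print(after_double)
```

Losing geneA disables R1 because the complex cannot assemble, whereas losing geneC alone leaves R2 active through geneD; R2 is only shut off when both isozyme genes are removed. The resulting bounds would then be passed to an FBA solver.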

Interpretation of gene and reaction deletion results

The utility of reaction inhibition and deletion analyses becomes most apparent if a gene-protein-reaction matrix has been assembled for the network being studied with FBA. The gene-protein-reaction matrix is a binary matrix connecting genes with the proteins made from them. Using this matrix, reaction essentiality can be converted into gene essentiality, indicating the gene defects that may cause a certain disease phenotype or the proteins and enzymes that are essential (and thus which enzymes are the most promising drug targets in pathogens). However, the gene-protein-reaction matrix does not specify the Boolean relationship between genes with respect to the enzyme; it merely indicates an association between them. Therefore, it should be used only if the Boolean GPR expression is unavailable.

Reaction inhibition

The effect of inhibiting a reaction, rather than removing it entirely, can be simulated in FBA by restricting the allowed flux through it. The effect of an inhibition can be classified as lethal or non-lethal by applying the same criteria as in the case of a deletion where a suitable threshold is used to distinguish “substantially reduced” from “slightly reduced”. Generally the choice of threshold is arbitrary but a reasonable estimate can be obtained from growth experiments where the simulated inhibitions/deletions are actually performed and growth rate is measured.
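Partial inhibition can be sketched by scaling down the upper bound of a reaction relative to its wild-type flux. The toy network, the inhibition levels and the 10% lethality threshold below are assumptions chosen for illustration, in line with the remark that the threshold is essentially arbitrary.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network:  R_up: -> A (max 10),  R1: A -> B,  R_bio: B ->
reactions = ["R_up", "R1", "R_bio"]
S = np.array([
    [1, -1,  0],   # metabolite A
    [0,  1, -1],   # metabolite B
])
bounds = [(0, 10), (0, 1000), (0, 1000)]

def fba(flux_bounds):
    """Return the flux vector that maximizes the biomass flux."""
    c = np.array([0.0, 0.0, -1.0])          # linprog minimizes
    return linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                   bounds=flux_bounds, method="highs").x

wild = fba(bounds)
wt_growth, wt_flux_r1 = wild[2], wild[1]
LETHAL_FRACTION = 0.1   # assumed: under 10% of wild-type growth is "lethal"

outcome = {}
for inhibition in (0.0, 0.5, 0.95, 1.0):
    capped = list(bounds)
    # Allow only the uninhibited fraction of the wild-type flux through R1.
    capped[1] = (0, (1.0 - inhibition) * wt_flux_r1)
    growth = fba(capped)[2]
    label = "lethal" if growth < LETHAL_FRACTION * wt_growth else "non-lethal"
    outcome[inhibition] = (growth, label)
    print(f"{inhibition:.0%} inhibition: growth {growth:.2f} ({label})")
```

Because R1 is the only route in this toy network, growth falls in proportion to the inhibition. In a network with alternative routes, a partial or even complete block may have little effect.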

Growth media optimization

To design optimal growth media with respect to enhanced growth rates or useful by-product secretion, it is possible to use a method known as Phenotypic Phase Plane (PhPP) analysis. PhPP involves applying FBA repeatedly to the model while co-varying the nutrient uptake constraints and observing the value of the objective function (or the by-product fluxes). PhPP makes it possible to find the optimal combination of nutrients that favors a particular phenotype or mode of metabolism, resulting in higher growth rates or secretion of industrially useful by-products. The predicted growth rates of bacteria in varying media have been shown to correlate well with experimental results, and the approach has also been used to define precise minimal media for the culture of Salmonella typhimurium.
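A phase-plane scan can be sketched by solving FBA over a grid of two uptake limits. The network below is hypothetical: a carbon source C and oxygen O feed an "aerobic" reaction with a high yield of a biomass precursor E, while a "fermentative" reaction has a low yield and secretes a by-product P. The stoichiometric yields (3 and 1) and the grid values are invented for illustration.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network:
#   R_c: -> C,  R_o: -> O,  R_aer: C + O -> 3 E,  R_ferm: C -> E + P,
#   R_p: P -> (by-product secretion),  R_bio: E -> (objective)
reactions = ["R_c", "R_o", "R_aer", "R_ferm", "R_p", "R_bio"]
S = np.array([
    [1, 0, -1, -1,  0,  0],   # C
    [0, 1, -1,  0,  0,  0],   # O
    [0, 0,  3,  1,  0, -1],   # E
    [0, 0,  0,  1, -1,  0],   # P
])

def optimal_state(carbon_max, oxygen_max):
    """Return (growth, by-product secretion) for the given uptake limits."""
    flux_bounds = [(0, carbon_max), (0, oxygen_max)] + [(0, 1000)] * 4
    objective = np.zeros(len(reactions))
    objective[reactions.index("R_bio")] = -1.0   # linprog minimizes
    res = linprog(objective, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=flux_bounds, method="highs")
    return res.x[reactions.index("R_bio")], res.x[reactions.index("R_p")]

carbon_levels = [0, 5, 10]
oxygen_levels = [0, 5, 10, 20]
growth = np.array([[optimal_state(cm, om)[0] for om in oxygen_levels]
                   for cm in carbon_levels])
print(growth)
```

Reading across a row of the grid shows two phases in this toy model: an oxygen-limited region, where extra oxygen raises growth and the by-product is secreted, and a carbon-limited region, where additional oxygen no longer helps.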