Model file format
The primary model definition file is the model.yaml file. When creating a
new model, this file should be placed in a new directory. The following can be
used as a template:
---
name: Escherichia coli test model
biomass: Biomass
extracellular: e
compartments:
  - id: e
    name: Extracellular
  - id: p
    name: Periplasm
    adjacent_to: [e, c]
  - id: c
    name: Cytosol
    adjacent_to: p
compounds:
  - include: ../path/to/ModelSEED_cpds.tsv
    format: modelseed
reactions:
  - include: reactions/reactions.tsv
  - include: reactions/biomass.yaml
exchange:
  - include: exchange.yaml
limits:
  - include: limits.yaml
model:
  - include: model_def.tsv
Biomass
The optional biomass key specifies the default reaction to use for
various analyses (e.g. FBA and FVA).
Extracellular Compartment
The optional extracellular key specifies the default ID used for the
extracellular compartment on compounds. If this option is not specified, the
extracellular compartment is assumed to be called e.
Default Compartment
The optional default_compartment key specifies the default compartment
that is used if a compound in a reaction does not explicitly specify a
compartment. For example, the reaction |x[e]| + |atp| => |x| + |adp| + |pi|
does not specify a compartment on four of the compounds, so those four are
presumed to be in the default compartment (or c if no default compartment is
specified).
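The rule above can be sketched in a few lines of Python. This is only an illustration: the function name assign_default_compartment and the regular expression are assumptions made for this example and are not part of PSAMM.

```python
import re

# Matches |name| or |name[compartment]| in a ModelSEED-style equation.
# (Hypothetical pattern written for this example only.)
COMPOUND_RE = re.compile(r'\|([^\[\]|]+)(?:\[([^\]]+)\])?\|')


def assign_default_compartment(equation, default='c'):
    """Return (compound, compartment) pairs, filling in the default."""
    return [(name, comp or default)
            for name, comp in COMPOUND_RE.findall(equation)]


pairs = assign_default_compartment('|x[e]| + |atp| => |x| + |adp| + |pi|')
for name, comp in pairs:
    print(name, comp)
```

Only x[e] carries an explicit compartment here; the other four compounds fall back to the default.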
Compartments
The compartments key is a list of compartment information for the model.
Compartments must always have an id but can also have additional user-defined
properties. The adjacent_to property is used to define the boundaries between
compartments. The adjacency can be specified as a single compartment or as a
list of compartments. It is sufficient to specify that p is adjacent to e; it
is then inferred that e is adjacent to p, so specifying both directions of
adjacency is optional.
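As a minimal sketch of how the symmetric adjacency could be inferred, the following Python snippet builds a two-way mapping from compartment entries like those in the template. The helper build_adjacency is hypothetical and does not reflect PSAMM's internal implementation.

```python
from collections import defaultdict


def build_adjacency(compartments):
    """Return a symmetric mapping of compartment ID to adjacent IDs."""
    adjacency = defaultdict(set)
    for comp in compartments:
        adjacent = comp.get('adjacent_to', [])
        # A single compartment may be given as a plain string.
        if isinstance(adjacent, str):
            adjacent = [adjacent]
        adjacency[comp['id']]  # ensure every compartment has an entry
        for other in adjacent:
            # Record both directions so only one needs to be declared.
            adjacency[comp['id']].add(other)
            adjacency[other].add(comp['id'])
    return dict(adjacency)


compartments = [
    {'id': 'e', 'name': 'Extracellular'},
    {'id': 'p', 'name': 'Periplasm', 'adjacent_to': ['e', 'c']},
    {'id': 'c', 'name': 'Cytosol', 'adjacent_to': 'p'},
]
adjacency = build_adjacency(compartments)
```

Although e declares no adjacency of its own, it ends up adjacent to p because p lists e.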
Compounds
The optional compounds key is a list of compound information. Some of the
model checks require the compound information. This section can also include
external files that contain compound information. If the file is a ModelSEED
compound table, the format key must be set to modelseed. If the file is a YAML
file, it should have a .yaml extension. The following fragment is an example
of a YAML formatted compound file:
- id: ac
  name: Acetate
  formula: C2H3O2
  charge: -1
- id: acac
  name: Acetoacetate
# ...
The following compound properties are recognized:
Property | Type | Description |
---|---|---|
id | string | Compound ID (required) |
name | string | Name of compound |
formula | string | Compound formula (e.g. C6H12O6) |
charge | integer | Formal charge |
kegg | string | KEGG ID (reference to compound in KEGG database) |
cas | string | CAS number |
Reactions
The reactions key specifies a list of files that define the reactions in the
model. The reaction files can be formatted as either tab-separated (.tsv) or
YAML (.yaml) files. The TSV format is adequate for most reaction definitions,
while particularly complex reactions (e.g. the biomass reaction) may be
specified using a YAML file.
The TSV format is a tab-separated table where each row contains the reaction ID
in addition to other data columns. The header must specify the type of each
column. The equation column will be parsed as ModelSEED reaction equations.
id	equation
ADE2t	|ade[e]| + |h[e]| <=> |ade[c]| + |h[c]|
ADK1	|amp| + |atp| <=> (2) |adp|
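To illustrate the table layout, the following sketch reads such a file with Python's standard csv module. It is not how PSAMM itself loads the file; the inline text merely stands in for a reactions.tsv file.

```python
import csv
import io

# Inline stand-in for a reactions.tsv file (columns separated by tabs).
tsv_text = (
    'id\tequation\n'
    'ADE2t\t|ade[e]| + |h[e]| <=> |ade[c]| + |h[c]|\n'
    'ADK1\t|amp| + |atp| <=> (2) |adp|\n'
)

# The header row names each column, so rows can be read as dictionaries.
reader = csv.DictReader(io.StringIO(tsv_text), delimiter='\t')
reactions = {row['id']: row['equation'] for row in reader}
```

Because the header names the columns, additional columns could be added without changing how the id and equation columns are read.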
Any .yaml or .yml file in the reactions specification will be parsed as a
reaction definition file in YAML format. This format is particularly useful
for very long reactions containing many different compounds (e.g. the biomass
reaction). It also allows adding more annotations because of the structured
nature of the YAML format. The following snippet is an example of a YAML
reaction file:
# Biomass composition
- id: Biomass
  equation:
    reversible: no
    left:
      - id: cpd00032 # Oxaloacetate
        value: 1
      - id: cpd00022 # Acetyl-CoA
        value: 1
      - id: cpd00035 # L-Alanine
        value: 0.02
      # ...
    right:
      - id: Biomass
        value: 1
      # ...
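One way to read the structured equation is as a set of signed stoichiometric coefficients. The sketch below assumes the common convention that compounds on the left are consumed (negative) and compounds on the right are produced (positive); the helper name stoichiometry is hypothetical, and the dictionary only mirrors the entries shown in the fragment above.

```python
def stoichiometry(equation):
    """Map each compound ID to a signed coefficient (left < 0, right > 0)."""
    coeffs = {}
    for side, sign in (('left', -1), ('right', 1)):
        for item in equation.get(side, []):
            coeffs[item['id']] = coeffs.get(item['id'], 0) + sign * item['value']
    return coeffs


# The equation as it would look after loading the YAML fragment
# (only the entries shown; the elided ones are left out).
biomass_equation = {
    'reversible': False,
    'left': [
        {'id': 'cpd00032', 'value': 1},
        {'id': 'cpd00022', 'value': 1},
        {'id': 'cpd00035', 'value': 0.02},
    ],
    'right': [
        {'id': 'Biomass', 'value': 1},
    ],
}
coeffs = stoichiometry(biomass_equation)
```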
Reactions in YAML files can also be defined using ModelSEED formatted reaction
equations. The | is a special character in YAML, so the reaction equations
have to be quoted with ' or, alternatively, written using > as a multiline
quote:
- id: ADE2t
  equation: >
    |ade[e]| + |h[e]| <=>
    |ade[c]| + |h[c]|
- id: ADK1
  equation: '|amp| + |atp| <=> (2) |adp|'
The following reaction properties are recognized:
Property | Type | Description |
---|---|---|
id | string | Reaction ID (required) |
name | string | Name of reaction |
equation | string or dict | Reaction equation formula |
ec | string | EC number |
genes | string | Gene association rule |
The genes property can be used to specify which genes enable a reaction.
Complex gene association rules can be used when a reaction is enabled by a
group of genes or when multiple genes can independently enable a reaction:
- id: ADK1
  equation: '|amp| + |atp| <=> (2) |adp|'
  genes: gene_0001 or (gene_0002 and gene_0003)
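To make the meaning of such a rule concrete, the following sketch evaluates a gene association rule against a set of genes that are present. It is a small recursive-descent evaluator written for this example, not PSAMM's own parser; it assumes lowercase and/or operators, well-formed parentheses, and that and binds more tightly than or.

```python
import re

# Split a rule into parentheses and words (gene IDs or operators).
TOKEN_RE = re.compile(r'\(|\)|[^\s()]+')


def evaluate_rule(rule, present_genes):
    """Return True if the gene rule is satisfied by present_genes."""
    tokens = TOKEN_RE.findall(rule)
    pos = 0

    def parse_or():
        nonlocal pos
        value = parse_and()
        while pos < len(tokens) and tokens[pos] == 'or':
            pos += 1
            rhs = parse_and()  # always parse, so tokens are consumed
            value = value or rhs
        return value

    def parse_and():
        nonlocal pos
        value = parse_atom()
        while pos < len(tokens) and tokens[pos] == 'and':
            pos += 1
            rhs = parse_atom()
            value = value and rhs
        return value

    def parse_atom():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == '(':
            value = parse_or()
            pos += 1  # skip the closing parenthesis
            return value
        return token in present_genes

    return parse_or()


rule = 'gene_0001 or (gene_0002 and gene_0003)'
print(evaluate_rule(rule, {'gene_0001'}))
print(evaluate_rule(rule, {'gene_0002'}))
```

With this rule, the reaction is enabled by gene_0001 alone, or by gene_0002 and gene_0003 together, but not by gene_0002 on its own.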
Exchange compounds
The exchange key provides a way of defining the compounds that can enter and
exit the model system (the boundary conditions). This includes compounds that
can enter the system (the medium) and compounds that are allowed to exit the
system, such as metabolic byproducts. In most cases, all compounds that occur
in the extracellular space should also be defined as exchange compounds (with
a lower limit of zero) so that they are allowed to leave the model system;
PSAMM will generate a warning if this is not the case for some compounds.
Compounds that can be taken up (the medium) should in addition be given a
negative lower limit indicating the maximum allowed uptake.
The following fragment is an example of the exchange.yaml file:
compartment: e # default compartment
compounds:
  - id: ac # Acetate
  - id: co2
  - id: o2
  - id: glcD # D-Glucose with uptake limit of 10
    lower: -10
  # ...
When an exchange file is specified, the corresponding exchange reactions are
automatically added. For example, if the compound o2 in compartment e is in
the exchange file, the exchange reaction EX_o2_e is added to the model. The
desired ID for the exchange reaction can be set explicitly using the reaction
attribute.
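The naming described above can be sketched as follows. The pattern EX_<compound>_<compartment> is inferred only from the EX_o2_e example, and the helper exchange_reaction_id is hypothetical rather than a PSAMM function.

```python
def exchange_reaction_id(compound, compartment='e', reaction=None):
    """Return the explicit reaction ID if given, else a generated one."""
    if reaction is not None:
        # An explicit reaction attribute overrides the generated ID.
        return reaction
    return 'EX_{}_{}'.format(compound, compartment)


print(exchange_reaction_id('o2', 'e'))
print(exchange_reaction_id('o2', 'e', reaction='EX_oxygen'))
```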
The exchange set can also be specified using a TSV file, as the following
fragment shows. The second column specifies the compartment, while the third
and fourth columns specify the lower and upper bounds, respectively. Both
bounds can be omitted or specified as - to use the default flux bounds:
# Acetate exchange with default lower and upper bounds
ac e
# D-Glucose with uptake limit of 10
glcD e -10
# CO2 exchange with production limit of 50 and default uptake limit
co2 e - 50
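The column handling in this fragment can be sketched in Python as follows. The function parse_exchange_line is hypothetical; it assumes comment lines have already been skipped and represents a default bound as None.

```python
def parse_exchange_line(line):
    """Parse one exchange row into (compound, compartment, lower, upper).

    Bounds that are omitted or given as '-' are returned as None,
    meaning "use the default flux bound".
    """
    fields = line.split()
    compound, compartment = fields[0], fields[1]
    bounds = [None if f == '-' else float(f) for f in fields[2:4]]
    # Pad with None when one or both bound columns are missing.
    bounds += [None] * (2 - len(bounds))
    lower, upper = bounds
    return compound, compartment, lower, upper


rows = [parse_exchange_line(line) for line in (
    'ac\te',
    'glcD\te\t-10',
    'co2\te\t-\t50',
)]
```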
Multiple exchange files can be included from the main exchange.yaml file,
and these will be combined to form the final set of exchange reactions used for
the simulations.
Reaction flux limits
The optional limits property lists the files that are to be combined and
applied as the reaction flux limits. This can be used to limit certain
reactions in the model. The following fragment is an example of a limits file
in the YAML format. The lower and upper keys specify the flux bounds, and both
are optional. The fixed key is a shortcut that sets both lower and upper to
its value:
- reaction: ADK1
  upper: 10
- reaction: ADE2t
  lower: -50
  upper: 50
- reaction: DHPTDNRN
  fixed: 0
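The way these keys combine can be sketched as follows. The helper resolve_bounds is hypothetical, and the default bounds of -1000 and 1000 are placeholder values chosen for the example; the sketch also ignores reaction reversibility, which would normally affect the default lower bound.

```python
def resolve_bounds(entry, default_lower=-1000, default_upper=1000):
    """Return (lower, upper) for one limits entry.

    The default bounds are placeholder values for this example only.
    """
    if 'fixed' in entry:
        # fixed is a shortcut for setting both bounds to the same value.
        return entry['fixed'], entry['fixed']
    return (entry.get('lower', default_lower),
            entry.get('upper', default_upper))


limits = [
    {'reaction': 'ADK1', 'upper': 10},
    {'reaction': 'ADE2t', 'lower': -50, 'upper': 50},
    {'reaction': 'DHPTDNRN', 'fixed': 0},
]
bounds = {entry['reaction']: resolve_bounds(entry) for entry in limits}
```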
The limits can also be specified using a TSV file, as shown in the following fragment:
# Make ADE2t irreversible by imposing a lower bound of 0
ADE2t 0
# Only allow limited flux on ADK1
ADK1 -10 10
Model Definition
The model property can be used to include a table file that specifies the
subset of reactions that are used in the model. If no model definition file
is given, all the reactions in the model will be used. The following fragment
is an example of a model definition file, listing one reaction ID per line:
ACALD
ACALDt
ACKr
...
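The selection behaviour described above can be sketched as a simple filter. The function select_reactions is hypothetical, and ADK1 is included in the example list only to show a reaction being left out.

```python
def select_reactions(all_reactions, model_definition=None):
    """Return the reaction IDs used in the model, preserving order."""
    if model_definition is None:
        # Without a model definition, every reaction is used.
        return list(all_reactions)
    wanted = set(model_definition)
    return [r for r in all_reactions if r in wanted]


all_reactions = ['ACALD', 'ACALDt', 'ACKr', 'ADK1']
subset = select_reactions(all_reactions, ['ACALD', 'ACKr'])
everything = select_reactions(all_reactions)
```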