Building an epistasis pipeline
The EpistasisPipeline object allows you to link epistasis models in series. Define each model and add it to a pipeline. When fit is called, the pipeline runs a cascade of fit_transform calls, one per model.
This is particularly useful if you need to remove nonlinearity in a genotype-phenotype map before fitting high-order epistasis (see this paper).
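The cascade can be pictured with a small, self-contained sketch. It does not use the epistasis package: ToyPipeline, SqrtStep, and CenterStep are hypothetical names invented for illustration, and the real EpistasisPipeline may differ in detail. The only point is that each step's fit_transform output becomes the input of the next step.

```python
import math


class SqrtStep:
    """Hypothetical step: removes a squaring nonlinearity from the phenotypes."""

    def fit_transform(self, phenotypes):
        # Nothing to learn here; just map the values onto the linear scale.
        return [math.sqrt(p) for p in phenotypes]


class CenterStep:
    """Hypothetical step: learns the mean and subtracts it."""

    def fit_transform(self, phenotypes):
        self.mean_ = sum(phenotypes) / len(phenotypes)
        return [p - self.mean_ for p in phenotypes]


class ToyPipeline(list):
    """Illustrative list of steps fit in series (not the library class)."""

    def fit(self, phenotypes):
        # Each step is fit on the output of the previous step.
        for step in self:
            phenotypes = step.fit_transform(phenotypes)
        self.transformed_ = phenotypes
        return self


pipeline = ToyPipeline([SqrtStep(), CenterStep()])
pipeline.fit([0.0, 1.0, 4.0, 9.0])
print(pipeline.transformed_)  # [-1.5, -0.5, 0.5, 1.5]
```

Here the second step sees the square-rooted values rather than the raw ones, which is why the mean it learns is 1.5.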
EpistasisPipeline inherits from Python's list type. This means you can append, insert, or pop models after the pipeline has been initialized. Each model is fit in the order it appears in the pipeline.
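Because the pipeline is a list, the usual list methods control the fit order. The sketch below uses a hypothetical stand-in class (OrderedSteps) and plain strings in place of models, purely to show how insert, append, and pop rearrange that order. It makes no claim about the library's internals.

```python
class OrderedSteps(list):
    """Hypothetical stand-in for a list-based pipeline."""

    def fit_order(self):
        # Steps run front to back, exactly as they are stored in the list.
        return [name for name in self]


steps = OrderedSteps(["linear"])
steps.insert(0, "power_transform")  # put a step at the front so it runs first
steps.append("extra")               # add a step at the end
removed = steps.pop()               # drop the last step again

print(steps.fit_order())  # ['power_transform', 'linear']
print(removed)            # extra
```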
Simple Example
In the example below, the power transform linearizes the map, and the linear regression then fits specific high-order epistasis on the linear scale. The fitted pipeline is then used to predict the phenotype of an unobserved genotype.
# GenotypePhenotypeMap is provided by the companion gpmap package.
from gpmap import GenotypePhenotypeMap
from epistasis import EpistasisPipeline
from epistasis.models import (EpistasisPowerTransform,
                              EpistasisLinearRegression)

# Define genotype-phenotype map.
gpm = GenotypePhenotypeMap(
    wildtype='AA',
    genotypes=['AA', 'AV', 'VV'],  # Note that we're missing the 'VA' genotype
    phenotypes=[0, .5, 1]
)
# Construct pipeline.
model = EpistasisPipeline([
    EpistasisPowerTransform(lmbda=1, A=0, B=0),
    EpistasisLinearRegression(order=2)
])
# Attach the genotype-phenotype map, then fit the pipeline.
model.add_gpm(gpm)
model.fit()
# Predict the phenotype of the missing genotype.
model.predict(['VA'])