API
Dataloaders
Module for loading data into the ltu-ili pipeline.
- class dataloaders.loaders.NumpyLoader(x: array, theta: array, xobs: array | None = None, thetafid: array | None = None)[source]
A class for loading in-memory data using numpy arrays.
- Parameters:
x (np.array) – Array of training data of shape (Ndata, *data.shape)
theta (np.array) – Array of training parameters of shape (Ndata, *parameters.shape)
xobs (Optional[np.array]) – Array of observed data of shape (*data.shape). Defaults to None.
thetafid (Optional[np.array]) – Array of fiducial parameters of shape (*parameters.shape). Defaults to None.
- get_all_data() array[source]
Returns all the loaded data for training
- Returns:
data
- Return type:
np.array
- get_all_parameters()[source]
Returns all the loaded parameters for training
- Returns:
parameters
- Return type:
np.array
- class dataloaders.loaders.SBISimulator(in_dir: str, xobs_file: str, num_simulations: int, simulator: callable | None = None, save_simulated: bool | None = False, x_file: str | None = None, theta_file: str | None = None, thetafid_file: str | None = None)[source]
Class to run simulations of data and parameters and save results to numpy files. Only works for sbi backend.
- Parameters:
in_dir (str) – path to the location of stored data
xobs_file (str) – filename used for observed x values
num_simulations (int) – number of simulations to run at each call
simulator (callable) – function that takes the parameters as an argument and returns data. NOTE: This must take a tuple of parameters and output a torch.Tensor of shape (1, *data.shape).
save_simulated (Optional[bool]) – whether to save simulated data. Concatenates to previous data if True. Defaults to False.
x_file (Optional[str]) – filename of the stored first-round training data
theta_file (Optional[str]) – filename of the stored first-round training parameters
thetafid_file (Optional[str]) – filename used for fiducial parameters
- set_simulator(simulator: callable)[source]
Set the simulator to be used in the inference
- Parameters:
simulator (callable) – function that takes the parameters as an argument and returns data
- simulate(proposal: Any) Tuple[array, array][source]
Run simulations given a proposal and return ($\theta, x$) pairs obtained from sampling the proposal and simulating.
- Parameters:
proposal (Any) – Distribution to sample parameters from
- Returns:
Sampled parameters $\theta$ and simulation outputs $x$.
- Return type:
Tuple[np.array, np.array]
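The sample-then-simulate step described for simulate can be pictured with a minimal, library-free sketch. The names ToyProposal, toy_simulator and simulate_pairs are hypothetical stand-ins and are not part of ltu-ili; the real class works with a torch-based proposal and a simulator that returns a torch.Tensor, whereas this sketch uses plain Python tuples and lists.

```python
import random
from typing import Callable, List, Sequence, Tuple


class ToyProposal:
    """Hypothetical proposal: independent uniform on [low, high] per dimension."""

    def __init__(self, low: Sequence[float], high: Sequence[float], seed: int = 0):
        self.low, self.high = list(low), list(high)
        self.rng = random.Random(seed)

    def sample(self) -> Tuple[float, ...]:
        # Draw one parameter vector, one uniform draw per dimension.
        return tuple(self.rng.uniform(lo, hi) for lo, hi in zip(self.low, self.high))


def toy_simulator(theta: Tuple[float, ...]) -> List[float]:
    # Hypothetical deterministic simulator: data = [sum, product] of the parameters.
    a, b = theta
    return [a + b, a * b]


def simulate_pairs(proposal: ToyProposal, simulator: Callable, num_simulations: int):
    """Sample parameters from the proposal and simulate data for each draw."""
    thetas, xs = [], []
    for _ in range(num_simulations):
        theta = proposal.sample()
        thetas.append(theta)
        xs.append(simulator(theta))
    return thetas, xs


proposal = ToyProposal(low=[0.0, 0.0], high=[1.0, 2.0])
thetas, xs = simulate_pairs(proposal, toy_simulator, num_simulations=5)
```

Each entry of thetas lines up with the entry of xs at the same index, which is the pairing the return value above describes.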
- class dataloaders.loaders.StaticNumpyLoader(in_dir: str, x_file: str, theta_file: str, xobs_file: str | None = None, thetafid_file: str | None = None)[source]
Loads single numpy files of data and parameters from disk
- Parameters:
in_dir (str) – path to the location of stored data
x_file (str) – filename of the stored training data
theta_file (str) – filename of the stored training parameters
xobs_file (Optional[str]) – filename used for observed x values
thetafid_file (Optional[str]) – filename used for fiducial parameters
- class dataloaders.loaders.SummarizerDatasetLoader(in_dir: str, stage: str, x_root: str, theta_file: str, train_test_split_file: str, param_names: List[str], xobs_file: str | None = None, thetafid_file: str | None = None)[source]
Class to load netCDF files of data and a csv of parameters. Basically a wrapper for ili-summarizer’s Dataset, with added functionality for loading parameters.
- Parameters:
in_dir (str) – path to data directory
stage (str) – whether to load train, test or val data
x_root (str) – root of data files
theta_file (str) – parameter file name
train_test_split_file (str) – file name where train, test, val split idx are stored
param_names (List[str]) – parameters to fit
xobs_file (Optional[str]) – filename used for observed x values
thetafid_file (Optional[str]) – filename used for fiducial parameters
- Raises:
Exception – raised when data and parameters do not have the same length
- get_nodes_for_stage(stage: str, train_test_split_file: str) List[int][source]
Get nodes for a given stage (train, test or val)
- Parameters:
stage (str) – either train, test or val
train_test_split_file (str) – file where node idx for each stage are stored
- Returns:
list of idx for stage
- Return type:
List[int]
- load_parameters(param_file: str, nodes: List[int], param_names: List[str]) array[source]
Get parameters for nodes
- Parameters:
param_file (str) – where to find parameters of latin hypercube
nodes (List[int]) – list of nodes to read
param_names (List[str]) – parameters to use
- Returns:
array of parameters
- Return type:
np.array
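How get_nodes_for_stage and load_parameters fit together can be sketched without the library. The on-disk formats here are assumptions: the split file is taken to be JSON mapping stage names to node indices, and the parameter file a csv with a header row and one row per node. The helper names nodes_for_stage and parameters_for_nodes, the parameter names, and all values are hypothetical.

```python
import csv
import io
import json
from typing import Dict, List

# Hypothetical file contents, held in memory so the sketch is self-contained.
SPLIT_JSON = '{"train": [0, 2], "val": [1], "test": [3]}'
PARAM_CSV = """Omega_m,sigma_8,h
0.30,0.80,0.70
0.25,0.85,0.68
0.35,0.75,0.72
0.28,0.82,0.69
"""


def nodes_for_stage(stage: str, split_text: str) -> List[int]:
    """Return the node indices stored for one stage (train, val or test)."""
    split: Dict[str, List[int]] = json.loads(split_text)
    return split[stage]


def parameters_for_nodes(
    csv_text: str, nodes: List[int], param_names: List[str]
) -> List[List[float]]:
    """Select the rows given by `nodes` and the columns given by `param_names`."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return [[float(rows[i][name]) for name in param_names] for i in nodes]


train_nodes = nodes_for_stage("train", SPLIT_JSON)
train_theta = parameters_for_nodes(PARAM_CSV, train_nodes, ["Omega_m", "sigma_8"])
```

The result has one row per selected node and one column per entry of param_names, in the order given.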
- class dataloaders.loaders.TorchLoader(train_loader: torch.utils.data.DataLoader, val_loader: torch.utils.data.DataLoader = None, xobs: torch.Tensor | None = None, thetafid: torch.Tensor | None = None)[source]
A class for using torch DataLoaders.
- Parameters:
train_loader (DataLoader) – dataloader for training outputting (data, parameters)
val_loader (DataLoader) – dataloader for validation outputting (data, parameters). Defaults to None.
xobs (Optional[Tensor]) – observed data. Defaults to None.
thetafid (Optional[Tensor]) – fiducial parameters. Defaults to None.
- get_all_data() torch.Tensor[source]
Returns all the loaded data for training. May need to be redefined for complex dataloaders.
- Returns:
data
- Return type:
Tensor
- get_all_parameters()[source]
Returns all the loaded parameters for training. May need to be redefined for complex dataloaders.
- Returns:
parameters
- Return type:
Tensor
Inference
sbi Runners
Module to train posterior inference models using the sbi package
- class inference.runner_sbi.ABCRunner(prior: torch.distributions.Distribution, engine: str, train_args: Dict = {}, out_dir: str | Path = None, device: str = 'cpu', name: str | None = '')[source]
Class to run ABC inference models using the sbi package
- __call__(loader: _BaseLoader, seed: int = None)[source]
Train your posterior and save it to file
- Parameters:
loader (_BaseLoader) – dataloader with stored data-parameter pairs
seed (int) – torch seed for reproducibility
- classmethod from_config(config_path: Path, **kwargs) ABCRunner[source]
Create an sbi runner from a yaml config file
- Parameters:
config_path (Path, optional) – path to config file
**kwargs – optional keyword arguments to overload config file
- Returns:
the sbi runner specified by the config file
- Return type:
ABCRunner
- class inference.runner_sbi.SBIRunner(prior: torch.distributions.Distribution, engine: str, nets: List[Callable], train_args: Dict = {}, out_dir: str | Path = None, device: str = 'cpu', embedding_net: torch.nn.Module = None, proposal: torch.distributions.Distribution = None, name: str | None = '', signatures: List[str] | None = None)[source]
Class to train posterior inference models using the sbi package. Follows methodology of:
engine=’NPE’: https://arxiv.org/abs/1905.07488
engine=’NLE’: https://arxiv.org/abs/1805.07226
engine=’NRE’: https://arxiv.org/pdf/2002.03712
- Parameters:
prior (Distribution) – prior on the parameters
engine (str) – type of inference engine to use (NPE, NLE, NRE, or any sbi inference engine; see _setup_engine)
nets (List[Callable]) – list of neural nets for amortized posteriors, likelihood models, or ratio classifiers
embedding_net (nn.Module) – neural network to compress high dimensional data into lower dimensionality
train_args (Dict) – dictionary of hyperparameters for training
out_dir (str, Path) – directory where to store outputs
proposal (Distribution) – proposal distribution from which existing simulations were run, for single round inference only. By default, sbi will set proposal = prior unless a proposal is specified.
name (str) – name of the model (for saving purposes)
signatures (List[str]) – list of signatures for each neural net
- __call__(loader: _BaseLoader, seed: int = None)[source]
Train your posterior and save it to file
- Parameters:
loader (_BaseLoader) – dataloader with stored data-parameter pairs
seed (int) – torch seed for reproducibility
- classmethod from_config(config_path: Path, **kwargs) SBIRunner[source]
Create an sbi runner from a yaml config file
- Parameters:
config_path (Path, optional) – path to config file
**kwargs – optional keyword arguments to overload config file
- Returns:
the sbi runner specified by the config file
- Return type:
SBIRunner
- class inference.runner_sbi.SBIRunnerSequential(prior: torch.distributions.Distribution, engine: str, nets: List[Callable], train_args: Dict = {}, out_dir: str | Path = None, device: str = 'cpu', embedding_net: torch.nn.Module = None, proposal: torch.distributions.Distribution = None, name: str | None = '', signatures: List[str] | None = None)[source]
Class to train posterior inference models using the sbi package with multiple rounds.
- Follows methodology of:
engine=’SNPE’: https://arxiv.org/abs/1905.07488
engine=’SNLE’: https://arxiv.org/abs/1805.07226
engine=’SNRE’: https://arxiv.org/pdf/2002.03712
pydelfi Runners
Module to train posterior inference models using the pyDELFI package
- class inference.runner_pydelfi.DelfiRunner(prior: Any, config_ndes: List[Dict], engine: str = 'NLE', engine_kwargs: Dict = {}, train_args: Dict = {}, out_dir: str | Path = None, device: str = 'cpu', name: str | None = '')[source]
Class to train posterior inference models using the pydelfi package. Follows methodology of: https://arxiv.org/abs/1903.00007
- Parameters:
prior (Any) – prior on the parameters
engine_kwargs (Dict) – dictionary of additional keywords for Delfi engine
train_args (Dict) – dictionary of hyperparameters for training
out_dir (str, Path) – directory where to store outputs
- __call__(loader)[source]
Train your posterior and save it to file
- Parameters:
loader (BaseLoader) – dataloader with stored data-parameter pairs
- classmethod from_config(config_path: Path, **kwargs) DelfiRunner[source]
Create a pyDELFI runner from a yaml config file
- Parameters:
config_path (Path, optional) – path to config file.
**kwargs – optional keyword arguments to overload config file
- Returns:
the pyDELFI runner specified by the config file
- Return type:
DelfiRunner
Lampe Runners
Module to train posterior inference models using the lampe package
- class inference.runner_lampe.LampeRunner(prior: torch.distributions.Distribution, nets: List[Callable], engine: str = 'NPE', train_args: Dict = {}, out_dir: Path = None, device: str = 'cpu', proposal: torch.distributions.Distribution = None, name: str | None = '', signatures: List[str] | None = None)[source]
Class to train NPE posterior inference models using the lampe package. Follows methodology of: https://arxiv.org/abs/1711.01861
- Parameters:
prior (Distribution) – prior on the parameters
nets (List[Callable]) – list of neural nets for amortized posteriors, likelihood models, or ratio classifiers
engine (str) – name of the engine class (NPE only)
train_args (Dict) – dictionary of hyperparameters for training
out_dir (Path) – directory where to store outputs
device (str) – device to run on
proposal (Distribution) – proposal distribution from which existing simulations were run, for single round inference only. By default, we will set proposal = prior unless a proposal is specified.
name (str) – name of the model (for saving purposes)
signatures (List[str]) – list of signatures for each neural net
- __call__(loader: _BaseLoader, seed: int = None)[source]
Train your posterior and save it to file
- Parameters:
loader (_BaseLoader) – dataloader with stored data-parameter pairs
seed (int) – torch seed for reproducibility
- classmethod from_config(config_path: Path, **kwargs) LampeRunner[source]
Create a lampe runner from a yaml config file
- Parameters:
config_path (Path, optional) – path to config file
**kwargs – optional keyword arguments to overload config file
- Returns:
the lampe runner specified by the config file
- Return type:
LampeRunner
pydelfi_wrappers
Module providing wrappers for the pydelfi package to conform with the sbi interface.
- class inference.pydelfi_wrappers.DelfiWrapper(*args: Any, **kwargs: Any)[source]
Trainer for a neural posterior ensemble using the pydelfi package. Wrapper for pydelfi.delfi.Delfi which adds some necessary functionality and interface.
- Parameters:
config_ndes (List[Dict]) – list with configurations for each neural posterior model in the ensemble
Other parameters are passed as input to the pydelfi.delfi.Delfi class
- classmethod load_engine(meta_path: str)[source]
Load a DelfiWrapper from metadata file
- Parameters:
meta_path (str) – path to saved metadata
- Returns:
a full Delfi inference model with pre-trained weights
- Return type:
DelfiWrapper
- static load_ndes(config_ndes: List[Dict], n_params: int, n_data: int) List[Callable][source]
Initialize the neural density estimators from configuration yamls.
- Parameters:
config_ndes (List[Dict]) – list with configurations for each neural posterior model in the ensemble
n_params (int) – dimensionality of each parameter vector
n_data (int) – dimensionality of each datapoint
- Returns:
list of neural posterior models with forward methods
- Return type:
List[Callable]
- log_posterior_stacked(theta: array, x: array)[source]
Redefinition of Delfi.log_posterior_stacked to do consistent shape handling of theta and x.
- Parameters:
theta (np.array) – parameter vector
x (np.array) – data vector to condition the inference on
- potential(theta: array, x: array)[source]
Returns the log posterior probability of a parameter vector given a data vector. Modification of Delfi.log_prob designed to conform with the form of sbi.utils.posterior_ensemble.
- Parameters:
theta (np.array) – parameter vector
x (np.array) – data vector to condition the inference on
- Returns:
log posterior probability
- Return type:
float
- sample(sample_shape: int | tuple, x: array, show_progress_bars=False, num_chains: int = 10, burn_in=200, thin=3, skip_initial_state_check: bool = False) array[source]
Samples from the posterior distribution using MCMC rejection. Modification of Delfi.emcee_sample designed to conform with the form of sbi.utils.posterior_ensemble
- Parameters:
sample_shape (int, tuple[int]) – size of samples to generate with each MCMC walker, after burn-in
x (np.array) – data vector to condition the inference on
show_progress_bars (bool) – whether to print sampling progress
num_chains (int) – number of MCMC chains to run in parallel
burn_in (int) – length of burn-in for MCMC sampling
thin (int) – thinning factor for MCMC sampling
skip_initial_state_check (bool) – whether to skip the initial state check for the MCMC sampler
- Returns:
array of unique samples of shape (# of samples, # of parameters), after MCMC rejection
- Return type:
np.array
Validation
Runner
Metrics
Embedding
Module providing compression networks for data.
- class embedding.fcn.FCN(*args: Any, **kwargs: Any)[source]
Fully connected network to compress data.
- Parameters:
n_hidden (List[int]) – number of hidden units per layer
act_fn (str) – activation function to use
Utils
NDEs (Pytorch)
Module to provide loading functions for ndes in various backends.
- All Mixture Density Networks (mdn) have the configuration:
hidden_features (int): width of hidden layers (each MDN has 3 hidden layers)
num_components (int): number of Gaussian components in the mixture model
- All flow-based models (maf, nsf, made) have the configuration:
hidden_features (int): width of hidden layers in the coupling layers
num_transforms (int): number of coupling layers
Linear classifiers (linear) have no arguments.
- MLP and ResNet classifiers (mlp, resnet) have the configuration:
hidden_features (int): width of hidden layers (each has 2 hidden layers)
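The configuration keys listed above can be collected into a small lookup table, which is handy for catching typos before a configuration reaches a loader. DOCUMENTED_KEYS and undocumented_keys are hypothetical helpers, not part of ltu-ili, and the numeric values are illustrative only. An undocumented key is not necessarily invalid, since the loaders forward extra keyword arguments to the model.

```python
from typing import Dict, List, Set

# Keys named in this section for each model type (sbi backend).
DOCUMENTED_KEYS: Dict[str, Set[str]] = {
    "mdn": {"hidden_features", "num_components"},
    "maf": {"hidden_features", "num_transforms"},
    "nsf": {"hidden_features", "num_transforms"},
    "made": {"hidden_features", "num_transforms"},
    "linear": set(),
    "mlp": {"hidden_features"},
    "resnet": {"hidden_features"},
}


def undocumented_keys(model: str, model_args: Dict) -> List[str]:
    """Return the keys in model_args that this section does not list for model."""
    return sorted(set(model_args) - DOCUMENTED_KEYS[model])


# Illustrative configurations; the values are arbitrary.
mdn_args = {"hidden_features": 50, "num_components": 4}
maf_args = {"hidden_features": 50, "num_transforms": 5}
typo_args = {"hidden_feature": 50}  # misspelled on purpose
```

Calling undocumented_keys("maf", typo_args) flags the misspelled key, while the two well-formed configurations produce an empty list.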
- class utils.ndes_pt.LampeEnsemble(*args: Any, **kwargs: Any)[source]
Simple module to wrap an ensemble of NPE models.
- potential(theta: torch.Tensor, x: torch.Tensor) torch.Tensor
- class utils.ndes_pt.LampeNPE(*args: Any, **kwargs: Any)[source]
Simple wrapper to add an embedding network to an NPE model.
- potential(theta: torch.Tensor, x: Any) torch.Tensor
- utils.ndes_pt.load_nde_lampe(model: str, embedding_net: torch.nn.Module = torch.nn.Identity, device: str | None = 'cpu', x_normalize: bool = True, theta_normalize: bool = True, **model_args)[source]
Load an nde from lampe. Models include:
mdn: Mixture Density Network (https://publications.aston.ac.uk/id/eprint/373/1/NCRG_94_004.pdf)
maf: Masked Autoregressive Flow (https://arxiv.org/abs/1705.07057)
nsf: Neural Spline Flow (https://arxiv.org/abs/1906.04032)
cnf: Continuous Normalizing Flow (https://arxiv.org/abs/1810.01367)
nice: Non-linear Independent Components Estimation (https://arxiv.org/abs/1410.8516)
gf: Gaussianization Flow (https://arxiv.org/abs/2003.01941)
sospf: Sum-of-Squares Polynomial Flow (https://arxiv.org/abs/1905.02325)
naf: Neural Autoregressive Flow (https://arxiv.org/abs/1804.00779)
unaf: Unconstrained Neural Autoregressive Flow (https://arxiv.org/abs/1908.05164)
For more info, see zuko at https://zuko.readthedocs.io/en/stable/index.html
- Parameters:
model (str) – model to use. One of: mdn, maf, nsf, ncsf, cnf, nice, sospf, gf, naf.
embedding_net (nn.Module, optional) – embedding network to use. Defaults to nn.Identity().
device (str, optional) – device to use. Defaults to ‘cpu’.
x_normalize (bool, optional) – whether to z-normalize x. Defaults to True.
theta_normalize (bool, optional) – whether to z-normalize theta. Defaults to True.
**model_args – additional arguments to pass to the model.
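The x_normalize and theta_normalize flags refer to z-normalization, which rescales each feature to zero mean and unit standard deviation. The sketch below shows the operation in plain Python; it is not the lampe or ltu-ili implementation, the helper name z_normalize is hypothetical, and the use of the population standard deviation is an assumption of this sketch.

```python
import statistics
from typing import List


def z_normalize(columns: List[List[float]]) -> List[List[float]]:
    """Z-normalize each feature column: subtract its mean, divide by its std."""
    out = []
    for col in columns:
        mean = statistics.fmean(col)
        std = statistics.pstdev(col)  # population std; an assumption of this sketch
        out.append([(v - mean) / std for v in col])
    return out


# Two features on very different scales.
features = [[1.0, 2.0, 3.0], [100.0, 300.0, 500.0]]
normalized = z_normalize(features)
```

After normalization both features span the same range, so neither dominates training purely because of its units. A constant feature has zero standard deviation and would need special handling that this sketch omits.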
- utils.ndes_pt.load_nde_sbi(engine: str, model: str, embedding_net: torch.nn.Module = torch.nn.Identity, **model_args)[source]
Load an nde from sbi.
- Parameters:
engine (str) – engine to use. One of: NPE, NLE, NRE, SNPE, SNLE, or SNRE.
model (str) – model to use. One of: mdn, maf, nsf, made, linear, mlp, resnet.
embedding_net (nn.Module, optional) – embedding network to use. Defaults to nn.Identity().
**model_args – additional arguments to pass to the model.
NDEs (Tensorflow)
Module to provide loading functions for ndes in pydelfi.
- All Mixture Density Networks (mdn) have the configuration:
hidden_features (int): width of hidden layers (each MDN has 3 hidden layers)
num_components (int): number of Gaussian components in the mixture model
- All flow-based models (maf) have the configuration:
hidden_features (int): width of hidden layers in the coupling layers
num_transforms (int): number of coupling layers
- utils.ndes_tf.load_nde_pydelfi(n_params: int, n_data: int, model: str, index: int = 0, **model_args)[source]
Load an nde from pydelfi.
- Parameters:
n_params (int) – dimensionality of parameters
n_data (int) – dimensionality of data points
model (str) – model to use. One of: mdn, maf.
index (int, optional) – index of the nde in the ensemble. Defaults to 0.
**model_args – additional arguments to pass to the model.
Prior Distributions (Pytorch)
Wrapper module to import distributions from torch.distributions and make their configuration easier in the ltu-ili interface.
Specifically, if we’re using a vector of parameters, we want to be able to pass the vector to the log_prob method of the distribution and return a scalar. This is not the default behavior of several distributions in torch.distributions, so we wrap them here.
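The scalar log_prob behavior described above amounts to summing per-dimension log-densities under an independence assumption. The plain-Python sketch below illustrates this for a uniform prior; independent_uniform_log_prob is a hypothetical helper, not the wrapper's implementation. In PyTorch the analogous construct is torch.distributions.Independent, though whether the wrappers here are built on it is not stated in this page.

```python
import math
from typing import Sequence


def independent_uniform_log_prob(
    theta: Sequence[float], low: Sequence[float], high: Sequence[float]
) -> float:
    """Scalar log-density of a parameter vector under independent uniforms."""
    total = 0.0
    for t, lo, hi in zip(theta, low, high):
        if not (lo <= t <= hi):
            return -math.inf  # outside the support in any one dimension
        total += -math.log(hi - lo)  # per-dimension log-density
    return total


low, high = [0.0, 0.0, -1.0], [1.0, 2.0, 1.0]
inside = independent_uniform_log_prob([0.5, 1.0, 0.0], low, high)
outside = independent_uniform_log_prob([0.5, 3.0, 0.0], low, high)
```

A three-dimensional parameter vector yields one number rather than three, which is what an inference engine expects when it evaluates the prior.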
- class utils.distributions_pt.IndependentBeta(*args: Any, **kwargs: Any)
- class utils.distributions_pt.IndependentCauchy(*args: Any, **kwargs: Any)
- class utils.distributions_pt.IndependentChi2(*args: Any, **kwargs: Any)
- class utils.distributions_pt.IndependentExponential(*args: Any, **kwargs: Any)
- class utils.distributions_pt.IndependentFisherSnedecor(*args: Any, **kwargs: Any)
- class utils.distributions_pt.IndependentGamma(*args: Any, **kwargs: Any)
- class utils.distributions_pt.IndependentGumbel(*args: Any, **kwargs: Any)
- class utils.distributions_pt.IndependentHalfCauchy(*args: Any, **kwargs: Any)
- class utils.distributions_pt.IndependentHalfNormal(*args: Any, **kwargs: Any)
- class utils.distributions_pt.IndependentLaplace(*args: Any, **kwargs: Any)
- class utils.distributions_pt.IndependentLogNormal(*args: Any, **kwargs: Any)
- class utils.distributions_pt.IndependentNormal(*args: Any, **kwargs: Any)
- class utils.distributions_pt.IndependentPareto(*args: Any, **kwargs: Any)
- class utils.distributions_pt.IndependentStudentT(*args: Any, **kwargs: Any)
- class utils.distributions_pt.IndependentTruncatedNormal(*args: Any, **kwargs: Any)
- Distribution: alias of _UnivariateTruncatedNormal
- class utils.distributions_pt.IndependentUniform(*args: Any, **kwargs: Any)
- class utils.distributions_pt.IndependentVonMises(*args: Any, **kwargs: Any)
- class utils.distributions_pt.IndependentWeibull(*args: Any, **kwargs: Any)
- utils.distributions_pt.Uniform: alias of IndependentUniform
Prior Distributions (Tensorflow)
Wrapper module to import distributions from pydelfi.priors and make their configuration easier in the sbi interface.
- class utils.distributions_tf.IndependentTruncatedNormal(*args: Any, **kwargs: Any)[source]
Note the pdf and logpdf as implemented in pydelfi are not normalized.
Samplers
Custom samplers for sampling posteriors for Likelihood Estimation and Ratio Estimation models. Currently supports emcee samplers for both sbi and pydelfi backends, and pyro samplers only for the sbi backend.
- class utils.samplers.DirectSampler(posterior: sbi.inference.posteriors.base_posterior.NeuralPosterior)[source]
Sampler class for posteriors with a direct sampling method, i.e. amortized posterior inference models.
- Parameters:
posterior (Posterior) – posterior object to sample from, must have a .sample method allowing for direct sampling.
- sample(nsteps: int, x: Any, progress: bool = False) ndarray[source]
Sample nsteps samples from the posterior, evaluated at data x.
- Parameters:
nsteps (int) – number of samples to draw
x (np.ndarray) – data to evaluate the posterior at
progress (bool, optional) – whether to show progress bar. Defaults to False.
- class utils.samplers.EmceeSampler(posterior: sbi.inference.posteriors.base_posterior.NeuralPosterior, num_chains: int = -1, thin: int = 10, burn_in: int = 100)[source]
Sampler class for emcee’s EnsembleSampler
- Parameters:
posterior (Posterior) – posterior object to sample from, must have a .potential method specifying the log posterior
num_chains (int, optional) – number of chains to sample from. Defaults to os.cpu_count()-1
thin (int, optional) – thinning factor for the chains. Defaults to 10
burn_in (int, optional) – number of steps to discard as burn-in. Defaults to 100
- sample(nsteps: int, x: ndarray, progress: bool = False, skip_initial_state_check: bool = False) ndarray[source]
Sample nsteps samples from the posterior, evaluated at data x.
- Parameters:
nsteps (int) – number of samples to draw
x (np.ndarray) – data to evaluate the posterior at
progress (bool, optional) – whether to show progress bar. Defaults to False.
skip_initial_state_check (bool, optional) – If True, a check that the initial_state can fully explore the space will be skipped. Defaults to False.
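The roles of burn_in and thin can be illustrated with a single-chain random-walk Metropolis sampler. This is a deliberately simpler stand-in: emcee uses an ensemble of walkers with affine-invariant moves, and nothing below is the EmceeSampler implementation. The names log_potential and metropolis_chain, the step size, and the target density are hypothetical.

```python
import math
import random
from typing import Callable, List


def log_potential(theta: float) -> float:
    # Hypothetical log posterior: a standard normal, up to a constant.
    return -0.5 * theta * theta


def metropolis_chain(
    log_prob: Callable[[float], float],
    n_keep: int,
    burn_in: int = 100,
    thin: int = 10,
    step: float = 1.0,
    seed: int = 0,
) -> List[float]:
    """Random-walk Metropolis returning n_keep samples after burn-in and thinning."""
    rng = random.Random(seed)
    theta, current = 0.0, log_prob(0.0)
    chain = []
    for _ in range(burn_in + n_keep * thin):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_prob(proposal)
        # Accept with probability min(1, exp(candidate - current)).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        chain.append(theta)
    # Discard the first burn_in steps, then keep every thin-th state.
    return chain[burn_in::thin]


samples = metropolis_chain(log_potential, n_keep=500)
```

Burn-in drops the early states that still depend on the starting point, and thinning reduces correlation between consecutive retained samples at the cost of running the chain thin times longer.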
- class utils.samplers.PyroSampler(posterior: sbi.inference.posteriors.base_posterior.NeuralPosterior, num_chains: int = -1, thin: int = 10, burn_in: int = 100, method='slice_np_vectorized')[source]
Sampler class for pyro’s samplers. Integrates with pyro through the sbi backend
- Parameters:
posterior (Posterior) – posterior object to sample from, must have a .potential method specifying the log posterior
num_chains (int, optional) – number of chains to sample from. Defaults to os.cpu_count()-1
thin (int, optional) – thinning factor for the chains. Defaults to 10
burn_in (int, optional) – number of steps to discard as burn-in. Defaults to 100
method (str, optional) – method to use for sampling. Defaults to ‘slice_np_vectorized’. See sbi documentation for more details.
- sample(nsteps: int, x: ndarray, progress: bool = False) ndarray[source]
Sample nsteps samples from the posterior, evaluated at data x.
- Parameters:
nsteps (int) – number of samples to draw
x (np.ndarray) – data to evaluate the posterior at
progress (bool, optional) – whether to show progress bar. Defaults to False.
- class utils.samplers.VISampler(posterior: sbi.inference.posteriors.base_posterior.NeuralPosterior, dist: str = 'maf', **train_kwargs)[source]
Sampler class for variational inference methods. See https://sbi-dev.github.io/sbi/reference/#sbi.inference.posteriors.vi_posterior.VIPosterior for more details.
- Parameters:
posterior (Posterior) – posterior object to sample from, must have a .potential method specifying the log posterior
dist (str, optional) – distribution to use for the variational inference. Defaults to ‘maf’.
train_kwargs (dict, optional) – keyword arguments to pass to the posterior’s train method. Defaults to {}.
- sample(nsteps: int, x: ndarray, progress: bool = False) ndarray[source]
Sample nsteps samples from the posterior, evaluated at data x.
- Parameters:
nsteps (int) – number of samples to draw
x (np.ndarray) – data to evaluate the posterior at
progress (bool, optional) – whether to show progress bar. Defaults to False.
Import Utilities
Module with tools for importing classes from modules and initializing them
- utils.import_utils.load_class(module_name: str, class_name: str) Any[source]
General tool to load any class from any module, without initialization.
- Parameters:
module_name (str) – module from which to import class
class_name (str) – class name
- Returns:
the class of choice
- Return type:
class (Any)
- utils.import_utils.load_from_config(config: Dict) Any[source]
General tool to load and initialize any class from any module with given configuration.
- Parameters:
config (Dict) – dictionary with the configuration for the object of the form: {‘module’: Module name, ‘class’: Class name, ‘args’: Dictionary of initialization arguments}
- Returns:
the object of choice
- Return type:
object (Any)
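The configuration form {‘module’: ..., ‘class’: ..., ‘args’: ...} maps naturally onto Python's importlib. The sketch below shows one way such loading could work; the function names load_class_sketch and load_from_config_sketch are hypothetical, and this is not claimed to be the ltu-ili implementation. A standard-library class is used so the example runs without any extra packages.

```python
import importlib
from typing import Any, Dict


def load_class_sketch(module_name: str, class_name: str) -> Any:
    """Import a module by name and return one of its classes, uninitialized."""
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def load_from_config_sketch(config: Dict) -> Any:
    """Instantiate config['class'] from config['module'] with config['args']."""
    cls = load_class_sketch(config["module"], config["class"])
    return cls(**config["args"])


# Standard-library class used purely for illustration.
config = {
    "module": "fractions",
    "class": "Fraction",
    "args": {"numerator": 3, "denominator": 4},
}
obj = load_from_config_sketch(config)
```

Because the class is resolved from strings, the same pattern lets a yaml file choose which object to build without the calling code importing it explicitly.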