metasyn.distribution.base
Module serving as the basis for all metasyn distributions.
The base module contains the BaseDistribution class,
which is the base class for all distributions.
It also contains the ScipyDistribution class,
which is a specialized base class for distributions that are built on top of
SciPy’s statistical distributions.
Additionally it contains the UniqueDistributionMixin class,
which is a mixin class that can be used to make a distribution unique
(i.e., one that does not contain duplicate values).
Finally it contains the metadist() decorator,
which is used to set the class attributes of a distribution.
Functions
| builtin_fitter() | Decorate builtin fitters. |
| convert_to_series() | Convert list or pandas series to polars series. |
| metadist() | Decorate class to create a distribution with the right properties. |
| metafit() | Decorate class to create a fitter with the correct class attributes. |
- metasyn.distribution.base.builtin_fitter(distribution=None, var_type=None, version=None, privacy_type=None)
Decorate builtin fitters.
- Parameters:
distribution (Optional[type[BaseDistribution]]) – Class that the fitter will return after a successful fit.
var_type (Union[str, list[str], None]) – Variable type(s) that the fitter implements, e.g. continuous, categorical, string.
version (Optional[str]) – Version of the fitter. Increment this to ensure that compatibility is properly handled.
privacy_type (Optional[str]) – Privacy class/implementation of the fitter.
- Returns:
Class with the appropriate class variables.
- Return type:
cls
- metasyn.distribution.base.convert_to_series(values)
Convert list or pandas series to polars series.
- Return type:
Series
- Parameters:
values (ndarray[tuple[Any, ...], dtype[_ScalarT]] | Series)
- metasyn.distribution.base.metadist(name=None, var_type=None, unique=None, version=None)
Decorate class to create a distribution with the right properties.
- Parameters:
name (Optional[str]) – Name that identifies the distribution uniquely, e.g. core.uniform, core.regex. The name should use a period (.) so that the first part is the namespace (e.g. core), and the second part is the name of the distribution.
var_type (Union[str, list[str], None]) – Variable type of the distribution, e.g. continuous, categorical, string.
unique (Optional[bool]) – Whether the distribution is unique or not.
version (Optional[str]) – Version of the distribution. Increment this to ensure that compatibility is properly handled.
- Returns:
Class with the appropriate class variables.
- Return type:
cls
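The attribute-setting pattern described for metadist() can be illustrated with a small stand-alone class decorator. This is a sketch only and does not import metasyn: `metadist_sketch`, `BaseSketch` and `ExampleUniform` are hypothetical names, and the rule that `None` arguments leave the existing class attribute untouched is an assumption suggested by the `None` defaults in the signature above.

```python
from typing import Optional, Union


def metadist_sketch(
    name: Optional[str] = None,
    var_type: Union[str, list[str], None] = None,
    unique: Optional[bool] = None,
    version: Optional[str] = None,
):
    """Hypothetical stand-in for a metadist-style class decorator."""

    def _wrap(cls):
        # Assumption: only arguments that are not None overwrite the
        # corresponding class attribute; the rest keep their defaults.
        if name is not None:
            cls.name = name
        if var_type is not None:
            cls.var_type = var_type
        if unique is not None:
            cls.unique = unique
        if version is not None:
            cls.version = version
        return cls

    return _wrap


class BaseSketch:
    """Hypothetical base class carrying the documented default attributes."""

    name = "unknown"
    var_type = "unknown"
    unique = False
    version = "1.0"


@metadist_sketch(name="core.uniform", var_type="continuous")
class ExampleUniform(BaseSketch):
    pass
```

Here `ExampleUniform` picks up the decorated `name` and `var_type`, while `unique` and `version` fall back to the values inherited from `BaseSketch`.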
- metasyn.distribution.base.metafit(distribution=None, var_type=None, version=None, privacy_type=None, plugin=None, plugin_version=None)
Decorate class to create a fitter with the correct class attributes.
- Parameters:
distribution (Optional[type[BaseDistribution]]) – Class that the fitter will return after a successful fit.
var_type (Union[str, list[str], None]) – Variable type(s) that the fitter implements, e.g. continuous, categorical, string.
version (Optional[str]) – Version of the fitter. Increment this to ensure that compatibility is properly handled.
privacy_type (Optional[str]) – Privacy class/implementation of the fitter.
plugin (Optional[str]) – Name of the plugin for the fitter or builtin (if part of metasyn itself).
plugin_version (Optional[str]) – Version of the plugin used.
- Returns:
Class with the appropriate class variables.
- Return type:
cls
Classes
| BaseDistribution | Abstract base class to define a distribution. |
| BaseFitter | Base class for fitters. |
| ScipyDistribution | Base class for numerical distributions using Scipy. |
| ScipyFitter | Base fitter for scipy distributions. |
| UniqueDistributionMixin | Mixin class to make unique version of base distributions. |
- class metasyn.distribution.base.BaseDistribution
Bases: ABC
Abstract base class to define a distribution.
All distributions should be derived from this class, and should implement the following methods: _fit(), draw(), _param_dict(), _param_schema(), default_distribution() and __init__.
- name: str = 'unknown'
The identifier for the implemented distribution
- var_type: Union[str, Sequence[str]] = 'unknown'
The variable type of the distribution
- unique: bool = False
Whether the distribution creates only unique values
- version: str = '1.0'
Version of the implemented distribution
- abstractmethod draw()
Draw a random element from the fitted distribution.
- Return type:
object
- draw_reset()
Reset the drawing of elements to start again.
- Return type:
None
- to_dict()
Convert the distribution to a dictionary.
- Return type:
dict
- classmethod schema()
Create sub-schema to validate GMF file.
- Return type:
dict
- classmethod from_dict(dist_dict)
Create a distribution from a dictionary.
- Return type:
BaseDistribution
- Parameters:
dist_dict (dict)
- information_criterion(values)
Get the BIC value for a particular set of values.
- Parameters:
values (array_like) – Values to determine the BIC value of.
- Return type:
float
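For reference, the Bayesian information criterion is conventionally defined as BIC = k ln(n) - 2 ln(L), where k is the number of parameters, n the number of values and L the maximized likelihood; lower values indicate a better trade-off between fit and complexity. The sketch below computes it for a normal distribution using only the standard library. `normal_logpdf` and `bic` are hypothetical helpers, and this page does not state whether metasyn computes the value in exactly this way.

```python
import math


def normal_logpdf(x: float, mean: float, sd: float) -> float:
    """Log-density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * sd**2) - (x - mean) ** 2 / (2 * sd**2)


def bic(values, mean: float, sd: float, n_par: int = 2) -> float:
    """Conventional BIC: n_par * ln(n) - 2 * log-likelihood (assumed formula)."""
    log_likelihood = sum(normal_logpdf(v, mean, sd) for v in values)
    return n_par * math.log(len(values)) - 2 * log_likelihood


values = [4.8, 5.1, 5.0, 4.9, 5.2]
good_fit = bic(values, mean=5.0, sd=0.15)  # parameters close to the data
poor_fit = bic(values, mean=0.0, sd=0.15)  # mean far away from the data
```

With the same number of parameters, the candidate whose parameters match the data yields the lower BIC, which is how such a criterion can rank competing distributions.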
- classmethod matches_name(name)
Check whether the name matches the distribution.
- Parameters:
name (str) – Name to match to the distribution.
- Returns:
Whether the name matches.
- Return type:
bool
- abstractmethod classmethod default_distribution(var_type=None)
Get a distribution with default parameters.
- Return type:
BaseDistribution
- Parameters:
var_type (str | None)
- draw_list(n)
Draw a list of values from the distribution.
- Parameters:
n (int) – Number of items to draw from the distribution.
- Raises:
NotImplementedError – If the distribution hasn’t implemented a draw_list.
- Return type:
list
- Returns:
List of values.
- class metasyn.distribution.base.BaseFitter(privacy)
Bases: ABC
Base class for fitters.
- Parameters:
privacy (BasePrivacy)
- classmethod matches_name(name)
Check whether the name matches the fitter.
- Parameters:
name (str) – Name to match to the fitter.
- Returns:
Whether the name matches.
- Return type:
bool
- class metasyn.distribution.base.ScipyDistribution
Bases: BaseDistribution
Base class for numerical distributions using Scipy.
This base class makes it easy to implement new numerical distributions. It can also be used for non-Scipy distributions, provided the distribution implements logpdf, rvs and fit methods.
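The duck-typed interface mentioned above can be sketched without SciPy. The class below provides logpdf, rvs and fit methods for an exponential distribution using only the standard library. `ExponentialLike` is a hypothetical name, and the method signatures loosely mirror scipy.stats conventions (parameters passed to `logpdf` and `rvs`, `fit` returning a parameter tuple); that ScipyDistribution expects exactly these signatures is an assumption, not something this page confirms.

```python
import math
import random


class ExponentialLike:
    """Hypothetical non-SciPy distribution with logpdf, rvs and fit methods."""

    @staticmethod
    def logpdf(x: float, scale: float) -> float:
        # Log-density of an exponential distribution with the given scale.
        if x < 0:
            return -math.inf
        return -math.log(scale) - x / scale

    @staticmethod
    def rvs(scale: float, size: int = 1, seed=None) -> list:
        # Draw `size` random values; `seed` makes the draw reproducible.
        rng = random.Random(seed)
        return [rng.expovariate(1.0 / scale) for _ in range(size)]

    @staticmethod
    def fit(data) -> tuple:
        # The maximum-likelihood estimate of the scale is the sample mean.
        return (sum(data) / len(data),)


sample = ExponentialLike.rvs(scale=2.0, size=1000, seed=42)
(scale_hat,) = ExponentialLike.fit(sample)
```

Fitting the drawn sample recovers a scale close to the one used for sampling, which is the round trip a wrapper class would rely on.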
- property n_par: int
Number of parameters for distribution.
- Type:
int
- draw()
Draw a random element from the fitted distribution.
- draw_list(n)
Draw a list of values from the distribution.
- Parameters:
n (int) – Number of items to draw from the distribution.
- Raises:
NotImplementedError – If the distribution hasn’t implemented a draw_list.
- Return type:
list
- Returns:
List of values.
- information_criterion(values)
Get the BIC value for a particular set of values.
- Parameters:
values (array_like) – Values to determine the BIC value of.
- class metasyn.distribution.base.ScipyFitter(privacy)
Bases: BaseFitter
Base fitter for scipy distributions.
- Parameters:
privacy (BasePrivacy)
- class metasyn.distribution.base.UniqueDistributionMixin(*args, **kwargs)
Bases: BaseDistribution
Mixin class to make unique version of base distributions.
This mixin class can be used to extend base distribution classes, adding functionality that ensures generated values are unique. It overrides the draw method of the base class, adding a check to prevent duplicate values from being drawn. If a duplicate value is drawn, it retries up to 1e5 times before raising a ValueError.
The UniqueDistributionMixin is used in various unique metasyn distribution variations, such as UniqueFakerDistribution and UniqueRegexDistribution.
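The retry behaviour described above can be sketched as a stand-alone mixin. This is not metasyn's implementation: `UniqueDrawSketch`, `DigitSketch` and `UniqueDigitSketch` are hypothetical names, the sketch does not inherit from BaseDistribution, and tracking drawn values in a set is an assumption about how duplicates might be detected.

```python
import random


class UniqueDrawSketch:
    """Hypothetical mixin: redraw until a value appears that was not drawn before."""

    max_retries = 100_000  # the 1e5 limit mentioned in the text

    def draw_reset(self):
        # Forget all previously drawn values so drawing can start again.
        self._seen = set()

    def draw(self):
        if not hasattr(self, "_seen"):
            self._seen = set()
        # Make at most max_retries attempts using the next draw() in the MRO.
        for _ in range(self.max_retries):
            value = super().draw()
            if value not in self._seen:
                self._seen.add(value)
                return value
        raise ValueError("Failed to draw a unique value.")


class DigitSketch:
    """Hypothetical base distribution drawing digits 0 to 9 with repeats."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)

    def draw(self):
        return self._rng.randint(0, 9)


class UniqueDigitSketch(UniqueDrawSketch, DigitSketch):
    max_retries = 1_000  # smaller limit so the exhausted case fails quickly


dist = UniqueDigitSketch(seed=0)
digits = [dist.draw() for _ in range(10)]
```

After ten draws every digit has been used once, so a further `dist.draw()` raises ValueError until `dist.draw_reset()` is called. Listing the mixin first in the bases makes its `draw` wrap the base class's `draw` through `super()`.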
- name
unknown
- unique
True
- version
1.0
- var_type
unknown
- unique: bool = True
Whether the distribution creates only unique values
- draw_reset()
Reset the drawing of elements to start again.
- draw()
Draw a random element from the fitted distribution.
- Return type:
object
- information_criterion(values)
Get the BIC value for a particular set of values.
- Parameters:
values (array_like) – Values to determine the BIC value of.