metasyn.VarSpec

class metasyn.VarSpec(name, distribution=None, unique=None, privacy=None, prop_missing=None, description=None, data_free=None, var_type=None)

Data class for storing the specifications for variables.

Parameters:
  • name (str) – Name of the variable/column.

  • distribution (Union[dict, type[BaseDistribution], BaseDistribution, DistributionSpec, str, None]) –

    Distribution to fit to the variable. Leave at None to let metasyn find the most suitable distribution automatically.

    >>> # Use normal distribution
    >>> distribution="normal"
    >>> # Use normal distribution with mean 0, standard deviation 1
    >>> distribution=NormalDistribution(0, 1)
    

  • unique – Whether the column is unique (a key). This is only available for the integer and string data types. Setting a variable to unique ensures that the synthetic values generated for it are unique, which is useful for ID or primary key variables, for example. The parameter… is ignored when the distribution is set manually. For example, {"unique": True} sets the variable to be unique, and {"unique": False} forces it to be not unique. If uniqueness is not specified, the variable is assumed to be not unique, but metasyn gives a warning if it thinks the variable should be unique.

  • privacy (Optional[BasePrivacy]) – Set the privacy level for a variable, e.g.: DifferentialPrivacy(epsilon=10).

  • prop_missing (Optional[float]) – Proportion of missing values for a variable.

  • description (Optional[str]) – Set the description of a variable.

  • data_free (Optional[bool]) – Whether this variable/column is to be generated from scratch or from an existing column in the dataframe.

  • var_type (Optional[str]) – Manually set the variable type of the column (used mainly for data_free columns).
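
The three states of the unique parameter described above (True, False, or unspecified) can be sketched in plain Python. This is only an illustration: resolve_unique is a hypothetical helper, not part of metasyn, and the "all observed values are distinct" check is an assumed stand-in for whatever metasyn actually uses to decide that a column looks unique.

```python
import warnings
from typing import Hashable, Optional, Sequence


def resolve_unique(unique: Optional[bool], values: Sequence[Hashable]) -> bool:
    """Hypothetical helper: decide whether a column is treated as unique."""
    if unique is not None:
        # An explicit True or False is taken as-is.
        return unique
    # Assumed heuristic for illustration only: every observed value is distinct.
    if len(values) > 1 and len(set(values)) == len(values):
        warnings.warn("Variable looks unique; consider setting unique=True.")
    # Unspecified uniqueness falls back to "not unique".
    return False


explicit = resolve_unique(True, [1, 1, 2])   # explicit setting wins
fallback = resolve_unique(None, [1, 1, 2])   # unspecified, values repeat
```

With unique left unspecified and all values distinct, the sketch still returns False but emits a warning, mirroring the behaviour described for the unique parameter.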

__init__(name, distribution=None, unique=None, privacy=None, prop_missing=None, description=None, data_free=None, var_type=None)
Parameters:
  • name (str)

  • distribution (dict | type[BaseDistribution] | BaseDistribution | DistributionSpec | str | None)

  • privacy (BasePrivacy | None)

  • prop_missing (float | None)

  • description (str | None)

  • data_free (bool | None)

  • var_type (str | None)

Methods

__init__(name[, distribution, unique, ...])

from_dict(var_dict)

Create a variable specification from a dictionary.
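
As a rough sketch of what building a specification from a dictionary could look like, the stand-in class below mirrors the documented constructor signature. VarSpecSketch is hypothetical and is not metasyn's implementation; it assumes the dictionary keys match the constructor parameter names, which the documentation above does not confirm.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass
class VarSpecSketch:
    """Stand-in mirroring the documented VarSpec signature (not metasyn's class)."""

    name: str
    distribution: Any = None
    unique: Optional[bool] = None
    privacy: Any = None
    prop_missing: Optional[float] = None
    description: Optional[str] = None
    data_free: Optional[bool] = None
    var_type: Optional[str] = None

    @classmethod
    def from_dict(cls, var_dict: dict) -> "VarSpecSketch":
        # Assumption: keys in var_dict are the constructor parameter names.
        allowed = {f.name for f in fields(cls)}
        unknown = set(var_dict) - allowed
        if unknown:
            raise ValueError(f"Unknown keys: {sorted(unknown)}")
        return cls(**var_dict)


# Keys that are left out keep their None defaults.
spec = VarSpecSketch.from_dict({"name": "PassengerId", "unique": True})
```

Rejecting unknown keys is a design choice of this sketch, made so that a misspelled key fails loudly instead of being silently dropped.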