metasyn.MetaVar
- class metasyn.MetaVar(name, var_type, distribution, dtype='unknown', description=None, prop_missing=0.0, creation_method=None)
Metadata variable describing a column in a MetaFrame.
MetaVar is a structure that holds all the metadata needed to generate a synthetic version of a column. It is the variable-level building block of the MetaFrame and contains the methods to convert a polars Series into a variable with an appropriate distribution. The MetaVar class is to the MetaFrame what a polars Series is to a DataFrame.
This class is considered a passthrough class used by the MetaFrame class, and is not intended to be used directly by the user.
- Parameters:
var_type (Optional[str]) – String containing the variable type, e.g. continuous, string, etc.
series – Series to create the variable from. Series is None by default and in this case the value is ignored. If it is not supplied, then the variable cannot be fit.
name (str) – Name of the variable/column.
distribution (BaseDistribution) – Distribution to draw random values from. Can also be set by using the fit method.
prop_missing (float) – Proportion of the series that are missing/NA.
dtype (str) – Type of the original values, e.g. int64, float, etc. Used for type-casting back. The default value is “unknown”.
description (Optional[str]) – User-provided description of the variable.
creation_method (Optional[dict]) – A dictionary that contains information on how the variable was created. If None, it will be assumed to have been created by the user.
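To illustrate how the parameters above might interact when values are generated, here is a minimal, self-contained sketch. It does not use metasyn: `ToyUniform` and `ToyVar` are hypothetical stand-ins, and the rule that each value is missing with probability `prop_missing` before the distribution is sampled is an assumption made for illustration, not a description of metasyn's internals.

```python
import random
from dataclasses import dataclass
from typing import Optional


class ToyUniform:
    """Hypothetical stand-in for a fitted distribution (not a metasyn class)."""

    def __init__(self, low: float, high: float, seed: int = 0):
        self.low = low
        self.high = high
        self._rng = random.Random(seed)

    def draw(self) -> float:
        # Draw one value uniformly between low and high.
        return self._rng.uniform(self.low, self.high)


@dataclass
class ToyVar:
    """Hypothetical analogue of a metadata variable; field names mirror the parameters above."""

    name: str
    var_type: Optional[str]
    distribution: ToyUniform
    dtype: str = "unknown"
    description: Optional[str] = None
    prop_missing: float = 0.0

    def draw_series(self, n: int, seed: int) -> list:
        # Assumption: each value is missing (None) with probability prop_missing,
        # otherwise it is drawn from the distribution.
        rng = random.Random(seed)
        return [
            None if rng.random() < self.prop_missing else self.distribution.draw()
            for _ in range(n)
        ]


var = ToyVar("age", "continuous", ToyUniform(18, 90), dtype="float", prop_missing=0.2)
values = var.draw_series(n=1000, seed=42)
n_missing = sum(v is None for v in values)
```

With `prop_missing=0.2`, roughly one fifth of the drawn values should be `None`; the rest fall inside the range of the toy distribution.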
- __init__(name, var_type, distribution, dtype='unknown', description=None, prop_missing=0.0, creation_method=None)
- Parameters:
name (str)
var_type (str | None)
distribution (BaseDistribution)
dtype (str)
description (str | None)
prop_missing (float)
creation_method (dict | None)
Methods
__init__(name, var_type, distribution[, ...])
draw() – Draw a random item for the variable in whatever type is required.
draw_series(n, seed[, progress_bar]) – Draw a new synthetic series from the metadata.
fit(series[, dist_spec, dist_registry, ...]) – Fit distributions to the data.
from_dict(var_dict[, plugins]) – Restore variable from dictionary.
to_dict() – Create a dictionary from the variable.
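The to_dict and from_dict methods listed above suggest a dictionary round trip. The following sketch shows that pattern on a hypothetical `ToyVar` dataclass using only the standard library. The dictionary keys and the layout of the `distribution` entry are assumptions chosen to mirror the constructor parameters; the dictionary that metasyn actually produces may be structured differently.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ToyVar:
    """Hypothetical stand-in for a metadata variable (not the metasyn class)."""

    name: str
    var_type: Optional[str]
    distribution: dict  # assumed layout: {"name": ..., "parameters": {...}}
    dtype: str = "unknown"
    description: Optional[str] = None
    prop_missing: float = 0.0
    creation_method: Optional[dict] = None

    def to_dict(self) -> dict:
        # Create a plain dictionary from the variable.
        return asdict(self)

    @classmethod
    def from_dict(cls, var_dict: dict) -> "ToyVar":
        # Restore the variable from a dictionary with the same keys.
        return cls(**var_dict)


original = ToyVar(
    name="age",
    var_type="continuous",
    distribution={"name": "uniform", "parameters": {"low": 18, "high": 90}},
    dtype="float",
    prop_missing=0.2,
)

# Serialize to JSON text and back to check that nothing is lost on the way.
restored = ToyVar.from_dict(json.loads(json.dumps(original.to_dict())))
```

Keeping the dictionary limited to JSON-compatible types is what allows the metadata to be written to disk and restored without access to the original data.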