metasyn.distribution.freetext.FreeTextDistribution

class metasyn.distribution.freetext.FreeTextDistribution(locale, avg_sentences, avg_words)

Free text distribution.

This distribution detects the language (with the lingua package) and generates sentences using the Faker package. The average numbers of sentences and words per item are detected using regexes.

Parameters:
  • locale (str) – Locale passed to the Faker package.

  • avg_sentences (float) – Average number of sentences (counted as punctuation marks) per non-NA row. If None, do not make sentences.

  • avg_words (float) – Average number of words per (non-NA) row.
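
The two averages above are described as being detected with regexes. The following is a minimal sketch of what such an estimate could look like, using only the standard library. The function name `estimate_text_averages` and both regex patterns are assumptions made for illustration; they are not taken from metasyn and may differ from what the package does.

```python
import re
from statistics import mean


def estimate_text_averages(rows):
    """Estimate average sentences and words per non-NA row (illustrative only)."""
    # Keep only string rows; None stands in for NA values here.
    texts = [row for row in rows if isinstance(row, str)]
    if not texts:
        raise ValueError("need at least one non-NA row")
    # Assumed pattern: each run of '.', '!' or '?' closes one sentence.
    n_sentences = [len(re.findall(r"[.!?]+", text)) for text in texts]
    # Assumed pattern: each maximal run of word characters is one word.
    n_words = [len(re.findall(r"\w+", text)) for text in texts]
    return mean(n_sentences), mean(n_words)


rows = ["Hello there. How are you?", None, "Fine!"]
avg_sentences, avg_words = estimate_text_averages(rows)
print(avg_sentences, avg_words)
```

The NA row is skipped before averaging, which matches the "per non-NA row" wording of the parameter descriptions.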

name

core.freetext

unique

False

version

1.0

var_type

string

__init__(locale, avg_sentences, avg_words)
Parameters:
  • locale (str)

  • avg_sentences (float)

  • avg_words (float)

Methods

__init__(locale, avg_sentences, avg_words)

default_distribution([var_type])

Get a distribution with default parameters.

draw()

Draw a random element from the fitted distribution.

draw_list(n)

Draw a list of values from the distribution.

draw_reset()

Reset the drawing of elements to start again.

from_dict(dist_dict)

Create a distribution from a dictionary.

information_criterion(values)

Get the BIC value for a particular set of values.

matches_name(name)

Check whether the name matches the distribution.

provides_var_type(var_type)

Check whether the distribution provides the given variable type.

schema()

Create a sub-schema to validate a GMF file.

to_dict()

Convert the distribution to a dictionary.
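
As a rough usage sketch of the methods listed above, the snippet below constructs a distribution, draws values from it, and round-trips it through a dictionary. It assumes a metasyn installation whose API matches this page; the helper name `roundtrip_demo`, the locale `"en_US"`, and the numeric parameter values are example choices, not values prescribed by the documentation.

```python
def roundtrip_demo():
    """Build a FreeTextDistribution, draw from it, and round-trip it via a dict."""
    # Imported inside the function so this file still loads without metasyn.
    from metasyn.distribution.freetext import FreeTextDistribution

    # "en_US" is an assumed example locale; the averages are arbitrary.
    dist = FreeTextDistribution(locale="en_US", avg_sentences=2.0, avg_words=12.0)

    one = dist.draw()         # a single synthetic text value
    many = dist.draw_list(3)  # a list of three values

    # Serialize to a dictionary and rebuild the distribution from it.
    restored = FreeTextDistribution.from_dict(dist.to_dict())
    return one, many, restored.to_dict() == dist.to_dict()


try:
    sample, samples, same = roundtrip_demo()
    print(sample, len(samples), same)
except ImportError:
    print("metasyn is not installed; skipping the demo")
```

Comparing the two dictionaries avoids relying on how the parameters are stored on the object.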

Attributes

name

The identifier for the implemented distribution

unique

Whether the distribution creates only unique values

var_type

The variable type of the distribution

version

Version of the implemented distribution