Internal Documentation

Module Markovify

The following are the private symbols of the module Markovify. Most users should not need them.

Markovify.Token — Constant.

Token{T} = Union{Symbol, T}

Tokens can be of any type. They can also include the symbols :begin and :end, which mark the beginning and the end of a suptoken.

Markovify.State — Type.

State{T} = Vector{Token{T}}

A state is described by a succession of tokens.
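To make the two aliases concrete, the snippet below re-declares them locally (so it runs without loading Markovify) and builds one state by hand. The example tokens are made up for illustration.

```julia
# Local copies of the aliases documented above.
const Token{T} = Union{Symbol, T}
const State{T} = Vector{Token{T}}

# A state of length 2 at the very start of a suptoken:
# one :begin padding symbol followed by the first real token.
state = Token{String}[:begin, "Hello"]

is_state = state isa State{String}
```

Because the element type is a union, one array can hold both the special symbols and ordinary tokens.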

Markovify.TokenOccurences — Type.

TokenOccurences{T} = Dict{Token{T}, Int}

A dictionary pairing tokens (or the special symbols :begin and :end) with the number of their respective occurrences.

Markovify.append_token — Method.

append_token(state, token)

Drop the first element of state and append token to the end of the state array.
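The sliding-window update described here can be sketched as follows. The name shift_state is hypothetical, and returning a fresh array instead of mutating the input is an assumption of this sketch; the description above does not say which of the two append_token does.

```julia
# Sketch only: drop the oldest token, append the newest one.
# Assumption: the input state is left untouched.
function shift_state(state::Vector, token)
    next = state[2:end]      # copy of everything except the first element
    push!(next, token)       # the new token goes to the end
    return next
end

old = Union{Symbol, String}[:begin, :begin]
new = shift_state(old, "The")
```

The state keeps its length, so a model of order n always looks at exactly the n most recent tokens.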

Markovify.begseq — Method.

begseq(n)

Return an array containing the symbol :begin repeated n times. This array is then used as the starting sequence for all suptokens.
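A one-line sketch of such a padding array, using Base's fill. The name start_padding is hypothetical and is not the library's code.

```julia
# Sketch only: an array of n copies of :begin.
start_padding(n) = fill(:begin, n)

padding = start_padding(2)
```

For a model of order 2 this is the state a walk would start from, before any real token has been seen.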

Markovify.indexin — Method.

indexin(array)

Given a sorted array, return the index at which a value n would have to be inserted so that the array stays sorted.
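The same kind of lookup is available in Base as searchsortedfirst, which the sketch below wraps. The name insertion_index is hypothetical, and placing ties before existing equal elements is an assumption; the description above does not say how ties are handled.

```julia
# Sketch only: first index whose element is >= n, or length + 1 if none is.
insertion_index(sorted, n) = searchsortedfirst(sorted, n)

sorted = [1, 4, 6]
i = insertion_index(sorted, 5)          # 5 belongs between 4 and 6
j = insertion_index(sorted, 7)          # 7 belongs after the last element

# Inserting at the returned index keeps the array sorted.
with_five = insert!(copy(sorted), i, 5)
```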

Markovify.next_token — Method.

next_token(model, state)

Return a randomly chosen token to follow the current state. The probability of each token being chosen is weighted by its value in the TokenOccurences dictionary of the current state, which is obtained from the model.

Markovify.randkey — Method.

randkey(dict)

Return a random key from dict. The probabilities of individual keys getting chosen are skewed by their respective values.
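One common way to draw a key with value-dependent probability is to sample from the cumulative sums of the values. The sketch below does this under the assumption that "skewed by their respective values" means proportional to them; weighted_key is a hypothetical name and not the library's implementation.

```julia
using Random

# Sketch only: pick a key with probability proportional to its integer value.
# Assumes every value is >= 0 and at least one value is positive.
function weighted_key(rng::AbstractRNG, counts::Dict{K, Int}) where {K}
    ks = collect(keys(counts))
    cumulative = cumsum([counts[k] for k in ks])   # running totals
    r = rand(rng, 1:cumulative[end])               # uniform in 1:total
    return ks[searchsortedfirst(cumulative, r)]    # first bucket that covers r
end

rng = MersenneTwister(1)
counts = Dict("world" => 3, "there" => 1)
draws = [weighted_key(rng, counts) for _ in 1:1000]
```

With these counts "world" should come up roughly three times as often as "there".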

Markovify.state_with_prefix — Method.

state_with_prefix(model, prefix; strict=false)

Attempt to return a random valid state of model that begins with prefix. If strict is false and the model has no state that begins with prefix, the function shortens the prefix (drops its last token) to relax the requirement and searches for a valid state again.
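The relaxation loop can be sketched over a plain list of states. Everything here is hypothetical: the name find_state_with_prefix, passing the states as a vector instead of a model, and returning nothing when no state is found (the description above does not say what happens in that case).

```julia
using Random

# Sketch only: find a random state starting with `prefix`, dropping the last
# prefix token and retrying whenever nothing matches (unless strict=true).
function find_state_with_prefix(rng, states, prefix; strict=false)
    p = collect(prefix)                    # private copy, safe to shorten
    while true
        n = length(p)
        matches = filter(s -> length(s) >= n && s[1:n] == p, states)
        isempty(matches) || return rand(rng, matches)
        # Assumption: give up with `nothing` if strict or nothing left to drop.
        (strict || isempty(p)) && return nothing
        pop!(p)                            # relax: cut the last token
    end
end

rng = MersenneTwister(0)
states = [["the", "cat"], ["the", "dog"], ["a", "cat"]]
exact   = find_state_with_prefix(rng, states, ["a"])
relaxed = find_state_with_prefix(rng, states, ["a", "dog"])
failed  = find_state_with_prefix(rng, states, ["a", "dog"]; strict=true)
```

In the second call no state starts with "a", "dog", so the sketch falls back to the shorter prefix "a".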

Markovify.states_with_suffix — Method.

states_with_suffix(model, init_suffix)

Return all of the states of model that end with init_suffix. If the number of such states is 1 (or 0), the function shortens the suffix (drops its first token) to relax the requirement and tries again.
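A minimal sketch of this search, again over a plain vector of states instead of a model. The name find_states_with_suffix is hypothetical, and so is the stopping rule once the suffix is empty: at that point every state matches, and the sketch returns them all.

```julia
# Sketch only: collect states ending in the suffix; while at most one state
# matches, drop the first suffix token and search again.
function find_states_with_suffix(states, init_suffix)
    suffix = collect(init_suffix)          # private copy, safe to shorten
    while true
        n = length(suffix)
        matches = filter(s -> length(s) >= n && s[end-n+1:end] == suffix, states)
        # Assumption: an empty suffix ends the search with whatever matched.
        (length(matches) > 1 || isempty(suffix)) && return matches
        popfirst!(suffix)                  # relax: cut the first token
    end
end

states = [["the", "cat"], ["a", "cat"], ["the", "dog"]]
direct    = find_states_with_suffix(states, ["cat"])
shortened = find_states_with_suffix(states, ["big", "cat"])
```

The second call finds no state ending in "big", "cat", so it retries with "cat" alone and returns the same two states as the first call.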

Markovify.stdweight — Method.

stdweight(state, token)

Always returns 1. Used as a placeholder function in Model to represent an unbiased weight function.

Markovify.walker — Function.

walker(model, init_state, init_accum, newstate=append_token)

Return an array of tokens obtained by a random walk through the Markov chain. The walk starts at the state init_state and ends once the special token :end is reached. A function newstate of the general type func(::State{T}, ::Token{T})::State{T} where T can be supplied; it is used to generate a new state from the old state and the token that follows it.

This is a general function which is used by all the walk functions.
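The walk can be sketched end to end on a toy chain. The model is represented here as a dictionary from states to occurrence counts, which is an assumption made for the sake of a self-contained example; random_walk, slide and the explicit rng argument are hypothetical names as well, and the real Model type is not used.

```julia
using Random

const Tok = Union{Symbol, String}

# Default state update: drop the oldest token, append the newest.
slide(state, token) = push!(state[2:end], token)

# Sketch only: follow weighted random transitions until :end is drawn.
function random_walk(rng, transitions, init_state, init_accum, newstate=slide)
    state = copy(init_state)
    accum = copy(init_accum)
    while true
        counts = transitions[state]
        ks = collect(keys(counts))
        cumulative = cumsum([counts[k] for k in ks])
        token = ks[searchsortedfirst(cumulative, rand(rng, 1:cumulative[end]))]
        token === :end && return accum     # the end marker is not emitted
        push!(accum, token)
        state = newstate(state, token)
    end
end

# A first-order chain with a single possible path.
transitions = Dict{Vector{Tok}, Dict{Tok, Int}}(
    Tok[:begin]  => Dict{Tok, Int}("hello" => 1),
    Tok["hello"] => Dict{Tok, Int}("world" => 1),
    Tok["world"] => Dict{Tok, Int}(:end => 1),
)

result = random_walk(MersenneTwister(0), transitions, Tok[:begin], Tok[])
```

Because each state in this toy chain has exactly one successor, the walk is deterministic and yields "hello" followed by "world".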

Module Markovify.Tokenizer

The following are the private symbols from the module Markovify.Tokenizer.

Tokenizer.to_words — Method.

to_words(tokens::Vector{<:AbstractString}; keeppunctuation=true)

Split all of the tokens in tokens into individual words by whitespace. If keeppunctuation is true, all of the special characters are preserved (and thus "glued" to the preceding/following word).
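A rough sketch of this behaviour follows. It makes two assumptions that the description does not confirm: the result holds one array of words per input token, and "special characters" means anything that is neither a word character nor whitespace. The name split_words is hypothetical.

```julia
# Sketch only: split each string on whitespace, optionally stripping
# punctuation first.
function split_words(tokens::Vector{<:AbstractString}; keeppunctuation=true)
    map(tokens) do t
        # Assumption: punctuation = every non-word, non-whitespace character.
        cleaned = keeppunctuation ? t : replace(t, r"[^\w\s]" => "")
        String.(split(cleaned))            # split on runs of whitespace
    end
end

sentences = ["Hello, world!", "How are you?"]
kept     = split_words(sentences)
stripped = split_words(sentences; keeppunctuation=false)
```

With keeppunctuation=true the comma stays attached to "Hello," and the exclamation mark to "world!", which matches the "glued" behaviour described above.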
