# Internal Documentation

## Module Markovify

The following are the private symbols of the module `Markovify`. Most users should not need them.

`Markovify.Token` (Constant)

`Token{T} = Union{Symbol, T}`

Tokens can be of any type. They can also be the symbols `:begin` and `:end`, which denote the beginning and the end of a suptoken.

`Markovify.State` (Type)

`State{T} = Vector{Token{T}}`

A state is described by a succession of tokens.

`Markovify.TokenOccurences` (Type)

`TokenOccurences{T} = Dict{Token{T}, Int}`

A dictionary mapping tokens (or the special symbols `:begin` and `:end`) to the number of their occurrences.
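The three aliases above can be written down and exercised directly. The following is a minimal sketch that re-declares them outside the package purely for illustration; the sample values and the variable names `state` and `counts` are made up.

```julia
# Re-declared here for illustration; inside the package these are private aliases.
const Token{T} = Union{Symbol, T}
const State{T} = Vector{Token{T}}
const TokenOccurences{T} = Dict{Token{T}, Int}

# A state of length two at the start of a suptoken: a :begin marker, then a word.
state = State{String}([:begin, "hello"])

# After that state, "world" was seen three times and the suptoken ended once.
counts = TokenOccurences{String}("world" => 3, :end => 1)
```

Because `Token{T}` is a union with `Symbol`, the same array or dictionary can hold both ordinary tokens and the `:begin`/`:end` markers.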

`Markovify.append_token` (Method)

`append_token(state, token)`

Drop the first element of `state` and append `token` to the end of the `state` array.
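A minimal sketch of this sliding-window update is shown below. The name `sketch_append_token` is hypothetical, and returning a new array instead of mutating `state` is an assumption; the section does not say which the private method does.

```julia
# Hypothetical stand-in for the private `append_token`; returns a new array
# rather than mutating `state` (an assumption, not confirmed by the docs).
function sketch_append_token(state, token)
    shifted = state[2:end]      # copy of `state` without its first element
    push!(shifted, token)       # append the new token at the end
    return shifted
end

state = Vector{Union{Symbol, String}}([:begin, :begin])
state = sketch_append_token(state, "hello")   # [:begin, "hello"]
state = sketch_append_token(state, "world")   # ["hello", "world"]
```

The length of the state never changes, so it always matches the length of the states stored in the model.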

`Markovify.begseq` (Method)

`begseq(n)`

Return an array containing the symbol `:begin` repeated `n` times. This array is used as the starting sequence for all suptokens.
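One way such a starting sequence could be built is with `fill` from Julia's Base. This is only a sketch; `sketch_begseq` is a hypothetical name, not the package's code.

```julia
# Hypothetical stand-in for the private `begseq`.
sketch_begseq(n) = fill(:begin, n)

start = sketch_begseq(2)   # a starting state of length two
```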

`Markovify.indexin` (Method)

`indexin(array)`

Given a sorted `array`, return the index at which a value `n` would have to be inserted to keep the array sorted.
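The behaviour described here closely matches `searchsortedfirst` from Julia's Base, which the sketch below uses. The two-argument wrapper `sketch_indexin` is hypothetical, and placing a value before equal elements is an assumption, since the description does not say how ties are handled.

```julia
# Hypothetical two-argument stand-in; `searchsortedfirst` is a Base function.
sketch_indexin(array, n) = searchsortedfirst(array, n)

sorted = [2, 5, 9]

sketch_indexin(sorted, 1)    # 1: goes before every element
sketch_indexin(sorted, 6)    # 3: goes between 5 and 9
sketch_indexin(sorted, 10)   # 4: goes after the last element
```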

`Markovify.next_token` (Method)

`next_token(model, state)`

Return, at random, a token that follows the current `state`. The probability of each token being chosen is weighted by its value in the `TokenOccurences` dictionary of the current `state`, which is obtained from the `model`.

`Markovify.randkey` (Method)

`randkey(dict)`

Return a random key from `dict`. The probability of each key being chosen is weighted by its value.
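Weighted key selection of this kind, which is also what `next_token` is described as doing on the `TokenOccurences` of a state, can be sketched with cumulative sums and a sorted search. The function name `sketch_randkey` is hypothetical, and the technique is a common one rather than a confirmed copy of the package's implementation.

```julia
# Hypothetical stand-in for the private `randkey`.
# Assumes `dict` is non-empty and its values are non-negative integers
# with a positive total.
function sketch_randkey(dict::AbstractDict)
    ks = collect(keys(dict))
    cumulative = cumsum([dict[k] for k in ks])   # running totals of the weights
    r = rand(1:cumulative[end])                  # uniform draw over the total weight
    return ks[searchsortedfirst(cumulative, r)]  # first key whose running total reaches r
end

counts = Dict("world" => 3, "there" => 1)
key = sketch_randkey(counts)   # "world" about three times out of four
```

A key with weight `w` owns exactly `w` of the possible draws, so its probability is `w` divided by the sum of all values.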

`Markovify.state_with_prefix` (Method)

`state_with_prefix(model, prefix; strict=false)`

Attempt to return a random valid state of `model` that begins with `prefix`. If `strict` is `false` and the `model` has no state that begins with `prefix`, the function shortens the prefix (drops its last token) to relax the requirement and tries again to find a valid state.
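The fallback logic can be sketched as a loop that keeps trimming the prefix. In the sketch below the model is replaced by a plain array of states, the name `sketch_state_with_prefix` is hypothetical, and returning `nothing` on failure is an assumption; the section does not say what the private method does when no state is found.

```julia
# Hypothetical stand-in: `states` is a plain array of states, not a `Model`.
function sketch_state_with_prefix(states, prefix; strict=false)
    prefix = collect(prefix)
    while true
        n = length(prefix)
        candidates = [s for s in states if length(s) >= n && s[1:n] == prefix]
        isempty(candidates) || return rand(candidates)
        # Give up in strict mode, or when there is nothing left to trim.
        (strict || n == 0) && return nothing
        prefix = prefix[1:end-1]   # drop the last token and retry
    end
end

states = [["a", "b"], ["a", "c"], ["b", "c"]]
sketch_state_with_prefix(states, ["b", "x"])               # falls back to prefix ["b"]
sketch_state_with_prefix(states, ["b", "x"]; strict=true)  # no fallback
```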

`Markovify.states_with_suffix` (Method)

`states_with_suffix(model, init_suffix)`

Return all states of `model` that end with `init_suffix`. If the number of such states is 1 (or 0), the function shortens the suffix (drops its first token) to relax the requirement and tries again.
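A minimal sketch of this retry rule follows. As an assumption, the model is again replaced by a plain array of states, and the loop stops once the suffix is empty; `sketch_states_with_suffix` is a hypothetical name.

```julia
# Hypothetical stand-in: `states` is a plain array of states, not a `Model`.
function sketch_states_with_suffix(states, init_suffix)
    suffix = collect(init_suffix)
    while true
        n = length(suffix)
        matches = [s for s in states if length(s) >= n && s[end-n+1:end] == suffix]
        # Accept the result once there is a real choice, or nothing left to trim.
        (length(matches) > 1 || n == 0) && return matches
        suffix = suffix[2:end]   # drop the first token and retry
    end
end

states = [["a", "b"], ["c", "b"], ["x", "y"]]
sketch_states_with_suffix(states, ["a", "b"])   # one exact match, so retried with ["b"]
```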

`Markovify.stdweight` (Method)

`stdweight(state, token)`

Always returns the constant `1`. Used as a placeholder function in `Model` to represent an unbiased weight function.

`Markovify.walker` (Function)

`walker(model, init_state, init_accum, newstate=append_token)`

Return an array of tokens obtained by a random walk through the Markov chain. The walk starts at the state `init_state` and ends once the special token `:end` is reached. A function `newstate` of the general type `func(::State{T}, ::Token{T})::State{T} where T` can be supplied to generate a new state from the old state and the following token.

This is a general function which is used by all the `walk` functions.
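The walk described above can be sketched as a short loop. Everything in the sketch is an assumption made for illustration: the model is a plain dictionary from states to occurrence counts (the layout of `Model` is not shown in this section), `:end` is not included in the result, and `weighted_key`, `shift` and `sketch_walker` are hypothetical names.

```julia
# Pick a key with probability proportional to its integer value.
function weighted_key(dict)
    ks = collect(keys(dict))
    cumulative = cumsum([dict[k] for k in ks])
    return ks[searchsortedfirst(cumulative, rand(1:cumulative[end]))]
end

# Hypothetical stand-in for `walker`; `nodes` maps each state to its token counts.
function sketch_walker(nodes, init_state, init_accum, newstate)
    state = init_state
    accum = copy(init_accum)
    while true
        token = weighted_key(nodes[state])
        token === :end && return accum   # stop at the end marker
        push!(accum, token)
        state = newstate(state, token)   # move to the next state
    end
end

# Sliding-window state update: drop the first token, append the new one.
shift(state, token) = Any[state[2:end]..., token]

# A tiny chain with a single possible path, so the walk is deterministic.
nodes = Dict{Any, Dict{Any, Int}}(
    [:begin, :begin] => Dict{Any, Int}("a" => 1),
    [:begin, "a"]    => Dict{Any, Int}("b" => 1),
    ["a", "b"]       => Dict{Any, Int}(:end => 1),
)

walk = sketch_walker(nodes, [:begin, :begin], Any[], shift)   # ["a", "b"]
```

Passing the state update in as `newstate` is what lets one loop serve several kinds of walk: only the rule for producing the next state changes.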

## Module Markovify.Tokenizer

The following are the private symbols of the module `Markovify.Tokenizer`.

`Tokenizer.to_words` (Method)

`to_words(tokens::Vector{<:AbstractString}; keeppunctuation=true)`

Split each of the tokens in `tokens` into individual words on whitespace. If `keeppunctuation` is `true`, all special characters are preserved (and thus "glued" to the preceding or following word).
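A rough sketch of this kind of splitting is given below. It assumes that the result is one array of words per input token and that "special characters" means what `ispunct` from Julia's Base recognises; neither is confirmed by this section, and `sketch_to_words` is a hypothetical name.

```julia
# Hypothetical stand-in for the private `to_words`.
function sketch_to_words(tokens::Vector{<:AbstractString}; keeppunctuation=true)
    # Optionally strip punctuation characters before splitting.
    cleaned = keeppunctuation ? tokens : [filter(c -> !ispunct(c), t) for t in tokens]
    # `split` with no delimiter splits on runs of whitespace.
    return [String.(split(t)) for t in cleaned]
end

sketch_to_words(["Hello, world!"])                          # [["Hello,", "world!"]]
sketch_to_words(["Hello, world!"]; keeppunctuation=false)   # [["Hello", "world"]]
```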