Internal Documentation
Module Markovify
The following are the private symbols of the module Markovify. Most users should not need them.
Markovify.Token
— Constant.
Token{T} = Union{Symbol, T}
Tokens can be of any type. A token can also be one of the special symbols :begin and :end, which denote the beginning and end of a suptoken.
Markovify.State
— Type.
State{T} = Vector{Token{T}}
A state is described by a succession of tokens.
Markovify.TokenOccurences
— Type.
TokenOccurences{T} = Dict{Token{T}, Int}
A dictionary pairing tokens (or the special symbols :begin and :end) with the number of their respective occurrences.
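To make the three aliases above concrete, here is a minimal sketch that re-declares them locally and builds one value of each. The re-declarations and the sample values are illustrative assumptions, not the package's source.

```julia
# Local re-declarations of the aliases described above (illustration only).
const Token{T} = Union{Symbol, T}
const State{T} = Vector{Token{T}}
const TokenOccurences{T} = Dict{Token{T}, Int}

# A state of order 2 over String tokens, sitting at the start of a suptoken.
state = State{String}([:begin, "hello"])

# How often each token was seen after that state (made-up numbers).
seen = TokenOccurences{String}("world" => 3, :end => 1)
```

Because Token{T} is a union with Symbol, the markers :begin and :end can share a container with ordinary tokens without any wrapping.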
Markovify.append_token
— Method.
append_token(state, token)
Drop the first element of state and append token at the end of the state array.
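The shift described above can be sketched as follows. The name sketch_append_token is hypothetical, and returning a fresh array instead of mutating the input is an assumption; the source does not say which the package does.

```julia
# Hypothetical sketch of the "drop first, append last" shift.
function sketch_append_token(state::Vector, token)
    next = state[2:end]   # copy of the state without its oldest token
    push!(next, token)    # the newest token goes last
    return next           # same length as the input state
end

state = Union{Symbol, String}[:begin, "to"]
shifted = sketch_append_token(state, "be")   # contains "to", "be"
```

The state keeps a fixed length, so it always holds the most recent tokens, one per order of the chain.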
Markovify.begseq
— Method.
begseq(n)
Return an array containing the symbol :begin repeated n times. This array is then used as the starting sequence for all suptokens.
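A one-line sketch of such a starting sequence, using Base's fill. The name sketch_begseq is hypothetical.

```julia
# Hypothetical sketch: n copies of the :begin marker.
sketch_begseq(n) = fill(:begin, n)

start = sketch_begseq(2)   # a 2-element vector holding :begin twice
```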
Markovify.indexin
— Method.
indexin(array)
Given a sorted array, return the index at which n would be inserted so that the array stays sorted.
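Base Julia offers the same kind of lookup through searchsortedfirst, which the sketch below uses. The helper name insertion_index and its two-argument form are assumptions, since the signature above does not show how n is passed. Placing equal elements before existing ones is also an assumption.

```julia
# Hypothetical helper; searchsortedfirst returns the first index whose
# element is greater than or equal to n (or length + 1 if there is none).
insertion_index(sorted, n) = searchsortedfirst(sorted, n)

insertion_index([1, 3, 5, 7], 4)   # 3: the value 4 belongs between 3 and 5
insertion_index([1, 3, 5, 7], 8)   # 5: one past the last element
```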
Markovify.next_token
— Method.
next_token(model, state)
Return a random token that follows the current state. The probability of each token being chosen is weighted by its count in the TokenOccurences dictionary of the current state, which is obtained from the model.
Markovify.randkey
— Method.
randkey(dict)
Return a random key from dict. The probability of each key being chosen is weighted by its value.
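One common way to draw a key with probability proportional to its value is to pick a uniform number in 1:total and walk the running sum of the counts. The sketch below does this; the name sketch_randkey and the explicit rng argument are assumptions, and it expects a non-empty dictionary with a positive total.

```julia
using Random

# Hypothetical sketch of a count-weighted draw over dictionary keys.
function sketch_randkey(rng::AbstractRNG, dict::AbstractDict{K, <:Integer}) where K
    total = sum(values(dict))
    target = rand(rng, 1:total)   # uniform over all counted occurrences
    running = 0
    for (key, count) in dict
        running += count
        running >= target && return key
    end
    error("unreachable when the total count is positive")
end

# "the" is three times as likely to be returned as "a".
key = sketch_randkey(MersenneTwister(1), Dict("the" => 3, "a" => 1))
```

The iteration order of the dictionary does not affect the probabilities, because each key owns a slice of 1:total whose width equals its count.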
Markovify.state_with_prefix
— Method.
state_with_prefix(model, prefix; strict=false)
Attempt to return a random valid state of model that begins with prefix. If strict is false and the model has no state that begins with prefix, the function shortens the prefix (cuts its last token) to lower the requirements and tries again to find a valid state.
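The fallback described above can be sketched over a plain list of states. Everything here is illustrative: the name sketch_state_with_prefix, passing the states directly instead of a model, and returning nothing when no state is found are all assumptions, as the source does not say what happens in that case.

```julia
# Hypothetical sketch: pick a random state starting with `prefix`,
# relaxing the prefix from the right unless `strict` is set.
hasprefix(state, prefix) =
    length(state) >= length(prefix) && state[1:length(prefix)] == prefix

function sketch_state_with_prefix(states, prefix; strict=false)
    candidates = filter(s -> hasprefix(s, prefix), states)
    if !isempty(candidates)
        return rand(candidates)
    elseif strict || isempty(prefix)
        return nothing                      # assumption: signal "no state found"
    else
        # Cut the last token of the prefix and try again.
        return sketch_state_with_prefix(states, prefix[1:end-1]; strict=strict)
    end
end

states = [["a", "b"], ["a", "c"], ["d", "e"]]
sketch_state_with_prefix(states, ["d", "x"])               # falls back to prefix ["d"]
sketch_state_with_prefix(states, ["d", "x"]; strict=true)  # no fallback
```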
Markovify.states_with_suffix
— Method.
states_with_suffix(model, init_suffix)
Return all states of model that end with init_suffix. If the number of such states is 1 (or 0), the function shortens the suffix (cuts its first token) to lower the requirements and tries again.
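A sketch of the suffix search with its relaxation step, again over a plain list of states. The name sketch_states_with_suffix and the choice to return every state once the suffix becomes empty are assumptions.

```julia
# Hypothetical sketch: collect states ending with `suffix`; if fewer than
# two are found, cut the first token of the suffix and search again.
function sketch_states_with_suffix(states, suffix)
    n = length(suffix)
    matches = filter(s -> length(s) >= n && s[end-n+1:end] == suffix, states)
    if length(matches) > 1 || n == 0
        return matches
    end
    return sketch_states_with_suffix(states, suffix[2:end])
end

states = [["a", "b"], ["c", "b"], ["a", "c"]]
sketch_states_with_suffix(states, ["x", "b"])   # relaxed to ["b"]: two states match
```

Each retry shortens the suffix by one token, so the recursion stops after at most length(suffix) retries.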
Markovify.stdweight
— Method.
stdweight(state, token)
A constant 1. Used as a placeholder function in Model to represent an unbiased weight function.
Markovify.walker
— Function.
walker(model, init_state, init_accum, newstate=append_token)
Return an array of tokens obtained by a random walk through the Markov chain. The walk starts at the state init_state and ends once the special token :end is reached. A function newstate of the general type func(::State{T}, ::Token{T})::State{T} where T can be supplied to generate a new state from the old state and the following token.
This is a general function used by all the walk functions.
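The walk can be sketched as a loop that draws a token, stops on :end, and otherwise shifts the state. This is not the package's implementation: the model is replaced by a hypothetical nodes dictionary from states to token counts, init_accum is assumed to be the array that collects the emitted tokens, and the helpers pick and shift are illustrative stand-ins.

```julia
using Random

const Tok = Union{Symbol, String}

# Count-weighted draw of the next token (stand-in for the model lookup).
function pick(rng, counts)
    target = rand(rng, 1:sum(values(counts)))
    running = 0
    for (token, n) in counts
        running += n
        running >= target && return token
    end
end

# Default state update: drop the oldest token, append the newest.
shift(state, token) = push!(state[2:end], token)

function sketch_walker(rng, nodes, init_state, init_accum, newstate=shift)
    state = init_state
    accum = copy(init_accum)
    while true
        token = pick(rng, nodes[state])
        token === :end && return accum   # the end marker is not emitted
        push!(accum, token)
        state = newstate(state, token)
    end
end

# A tiny first-order chain with a single path: :begin, "to", "be", :end.
nodes = Dict{Vector{Tok}, Dict{Tok, Int}}(
    Tok[:begin] => Dict{Tok, Int}("to" => 1),
    Tok["to"]   => Dict{Tok, Int}("be" => 1),
    Tok["be"]   => Dict{Tok, Int}(:end => 1),
)
walked = sketch_walker(MersenneTwister(0), nodes, Tok[:begin], Tok[])
```

Passing a different newstate function changes how the next state is derived without touching the loop itself.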
Module Markovify.Tokenizer
The following are the private symbols of the module Markovify.Tokenizer.
Tokenizer.to_words
— Method.
to_words(tokens::Vector{<:AbstractString}; keeppunctuation=true)
Split each of the tokens in tokens into individual words on whitespace. If keeppunctuation is true, all special characters are preserved (and thus "glued" to the preceding or following word).
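The effect of the keyword can be sketched as below. The name sketch_to_words is hypothetical. Two things are assumptions because the source does not state them: that keeppunctuation=false removes every character that is not a letter, digit, or whitespace, and that the result is one array of words per input token.

```julia
# Hypothetical sketch: whitespace splitting with optional punctuation removal.
function sketch_to_words(tokens::Vector{<:AbstractString}; keeppunctuation=true)
    # Assumption: "special characters" means anything that is not a letter,
    # a digit, or whitespace.
    keep(c) = keeppunctuation || isletter(c) || isdigit(c) || isspace(c)
    return [String.(split(filter(keep, token))) for token in tokens]
end

sketch_to_words(["Hello, world!"])                         # punctuation stays glued to words
sketch_to_words(["Hello, world!"]; keeppunctuation=false)  # punctuation removed first
```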