mlstatpy.nlp.completion

class mlstatpy.nlp.completion.CompletionTrieNode(value, leave, weight=1.0, disp=None)

Node definition in a trie used for completion, see Complétion. This implementation is not memory efficient: it does not scale beyond about 200,000 words. A more compact implementation (Cython, C++) would be needed for larger vocabularies.

all_completions() → List[Tuple[CompletionTrieNode, List[str]]]

Retrieves all completions for a node. The method does not require precompute_stat() to be run first.

all_mks_completions() → List[Tuple[CompletionTrieNode, List[CompletionTrieNode]]]

Retrieves all completions for a node. The method assumes precompute_stat() was run beforehand.

static build(words) → CompletionTrieNode

Builds a trie.

Parameters:

words – list of (word) or (weight, word) or (weight, word, display string)

Returns:

root of the trie (CompletionTrieNode)

find(prefix: str) → CompletionTrieNode

Returns the node which holds all completions starting with a given prefix.

Parameters:

prefix – prefix

Returns:

node or None for no result
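The prefix lookup described above can be pictured with a minimal trie. The sketch below is not the mlstatpy implementation: `TrieNode`, `build_trie` and `find_prefix` are hypothetical names, and only plain words are handled (no weights, no display strings).

```python
from typing import Dict, Iterable, Optional


class TrieNode:
    """Hypothetical, simplified node: one node per prefix."""

    def __init__(self, value: str, leave: bool = False):
        self.value = value          # prefix spelled by the path from the root
        self.leave = leave          # True if the prefix is itself a full word
        self.children: Dict[str, "TrieNode"] = {}


def build_trie(words: Iterable[str]) -> TrieNode:
    """Builds a trie from plain words and returns its root."""
    root = TrieNode("")
    for word in words:
        node = root
        for i, char in enumerate(word):
            if char not in node.children:
                node.children[char] = TrieNode(word[: i + 1])
            node = node.children[char]
        node.leave = True           # the last node reached spells a word
    return root


def find_prefix(root: TrieNode, prefix: str) -> Optional[TrieNode]:
    """Returns the node spelling `prefix`, or None if no word starts with it."""
    node = root
    for char in prefix:
        node = node.children.get(char)
        if node is None:
            return None
    return node


root = build_trie(["tab", "table", "tea"])
node = find_prefix(root, "ta")      # node whose subtree holds "tab" and "table"
```

Every completion of a prefix lives in the subtree under the returned node, which is why a single walk down the trie is enough to answer a prefix query.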

items() → Iterator[Tuple[float, str, CompletionTrieNode]]

Iterates over the children, yielding (weight, key, child) tuples.

items_list() → List[CompletionTrieNode]

All children nodes, including the node itself, in a list.

Returns:

list of nodes

iter_leaves(max_weight=None) → Iterator[Tuple[float, str]]

Iterates over the leaves sorted by weight, yielding (weight, value).

Parameters:

max_weight – keep all values under this threshold, or None to keep all of them

leaves() → Iterator[CompletionTrieNode]

Iterates over the leaves.

min_dynamic_keystroke(word: str) → Tuple[int, int]

Returns the dynamic minimum keystrokes for a word.

Parameters:

word – word

Returns:

number, length of best prefix, iteration it stops moving

This function must be called after precompute_stat() and update_stat_dynamic(). See Dynamic Minimum Keystroke.

\begin{eqnarray*}
K(q, k, S) &=& \min\acc{ i | s_i \succ q[1..k], s_i \in S } \\
M'(q, S) &=& \min_{0 \leqslant k \leqslant l(q)}
\acc{ M'(q[1..k], S) + K(q, k, S) | q[1..k] \in S }
\end{eqnarray*}
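The recursion above can be sketched by brute force on a small ordered list of completions. This is only an illustration under stated assumptions, not the library's algorithm: `make_mks_functions` is a hypothetical helper, K(q, k, S) is taken to be the 1-based rank of q among the completions in S starting with q[1..k], typing the whole word always costs l(q), only strict prefixes (k < l(q)) are used in the recursion, and M' is bounded above by the static value M.

```python
from functools import lru_cache
from typing import Callable, Optional, Sequence, Tuple


def make_mks_functions(
    completions: Sequence[str],
) -> Tuple[Callable[[str], int], Callable[[str], int]]:
    """Returns (static_mks, dynamic_mks) for completions ordered by priority."""
    ordered = tuple(completions)
    known = set(ordered)

    def rank(query: str, prefix: str) -> Optional[int]:
        # Assumed meaning of K(q, k, S): 1-based position of the query among
        # the completions shown for the prefix, or None if it is not shown.
        shown = [s for s in ordered if s.startswith(prefix)]
        return shown.index(query) + 1 if query in shown else None

    def static_mks(query: str) -> int:
        # M(q, S) = min_k k + K(q, k, S), never worse than typing everything.
        best = len(query)
        for k in range(len(query) + 1):
            r = rank(query, query[:k])
            if r is not None:
                best = min(best, k + r)
        return best

    @lru_cache(maxsize=None)
    def dynamic_mks(query: str) -> int:
        # M'(q, S): a prefix that is itself a completion costs M'(prefix)
        # keystrokes to reach instead of its length.
        best = static_mks(query)
        for k in range(1, len(query)):
            prefix = query[:k]
            if prefix in known:
                r = rank(query, prefix)
                if r is not None:
                    best = min(best, dynamic_mks(prefix) + r)
        return best

    return static_mks, dynamic_mks


words = ["car", "cat", "cab", "can", "cap", "carpet", "carpenter"]
static_mks, dynamic_mks = make_mks_functions(words)
print(static_mks("carpenter"), dynamic_mks("carpenter"))  # prints "6 4"
```

Here "car" is reachable in one keystroke (it is the first suggestion for the empty prefix), so "carpenter" costs 1 + 3 = 4 in the dynamic setting instead of 3 + 3 = 6 in the static one.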

min_dynamic_keystroke2(word: str) → Tuple[int, int]

Returns the modified dynamic minimum keystrokes for a word.

Parameters:

word – word

Returns:

number, length of best prefix, iteration it stops moving

This function must be called after precompute_stat() and update_stat_dynamic(). See Modified Dynamic Minimum Keystroke.

\begin{eqnarray*}
K(q, k, S) &=& \min\acc{ i | s_i \succ q[1..k], s_i \in S } \\
M''(q, S) &=& \min \left\{ \begin{array}{l}
                \min_{1 \leqslant k \leqslant l(q)}
                \acc{ M''(q[1..k-1], S) + 1 + K(q, k, S) | q[1..k]
                \in S } \\
                \min_{0 \leqslant k \leqslant l(q)}
                \acc{ M''(q[1..k], S) + \delta + K(q, k, S) | q[1..k]
                \in S }
                \end{array} \right.
\end{eqnarray*}

min_keystroke(word: str) → Tuple[int, int]

Returns the minimum keystrokes for a word without optimisation. Use this function if you only have a couple of values to compute; use min_keystroke0() to compute all of them.

Parameters:

word – word

Returns:

number, length of best prefix

See Problème d’optimisation.

\begin{eqnarray*}
K(q, k, S) &=& \min\acc{ i | s_i \succ q[1..k], s_i \in S } \\
M(q, S) &=& \min_{0 \leqslant k \leqslant l(q)}  k + K(q, k, S)
\end{eqnarray*}
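The definition of M(q, S) can be checked by hand on a short list with a brute-force sketch. It does not use the trie and is not the library's code: `min_keystroke_bruteforce` is a hypothetical name, K(q, k, S) is assumed to be the 1-based rank of q among the completions of S starting with q[1..k], and typing the full word is assumed to cost l(q) keystrokes.

```python
from typing import Optional, Sequence, Tuple


def rank_among_completions(
    query: str, prefix: str, completions: Sequence[str]
) -> Optional[int]:
    """Assumed K(q, k, S): 1-based rank of the query among the completions
    starting with the prefix, or None if the query is not one of them."""
    shown = [s for s in completions if s.startswith(prefix)]
    return shown.index(query) + 1 if query in shown else None


def min_keystroke_bruteforce(
    query: str, completions: Sequence[str]
) -> Tuple[int, int]:
    """Returns (minimum keystrokes, length of the best prefix)."""
    # Fallback: type every character of the word.
    best = (len(query), len(query))
    for k in range(len(query) + 1):
        rank = rank_among_completions(query, query[:k], completions)
        # k characters typed, then `rank` moves down the suggestion list.
        if rank is not None and k + rank < best[0]:
            best = (k + rank, k)
    return best


# Completions ordered by decreasing priority.
S = ["table", "tab", "tablet", "apple"]
print(min_keystroke_bruteforce("apple", S))   # prints "(2, 1)"
print(min_keystroke_bruteforce("tablet", S))  # prints "(3, 0)"
```

Each call scans the whole list for every prefix, so the cost grows with both the word length and the size of S; this is the kind of repeated work a precomputed trie avoids.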

min_keystroke0(word: str) → Tuple[int, int]

Returns the minimum keystrokes for a word.

Parameters:

word – word

Returns:

number, length of best prefix, iteration it stops moving

This function must be called after precompute_stat() and update_stat_dynamic().

See Problème d’optimisation.

\begin{eqnarray*}
K(q, k, S) &=& \min\acc{ i | s_i \succ q[1..k], s_i \in S } \\
M(q, S) &=& \min_{0 \leqslant k \leqslant l(q)}  k + K(q, k, S)
\end{eqnarray*}

precompute_stat()

Computes and stores the list of completions for each node, and computes the minimum keystrokes (mks).

Parameters:

clean – clean stat

property root

Returns the initial node with no parent.

str_all_completions(maxn=10, use_precompute=True) → str

Builds a string with all completions for all prefixes along the paths.

Parameters:
  • maxn – maximum number of completions to show

  • use_precompute – use intermediate results built by precompute_stat()

Returns:

str

unsorted_iter()

Iterates on all nodes.

update_stat_dynamic(delta=0.8)

Computes the dynamic mks (see Dynamic Minimum Keystroke). Must be called after precompute_stat().

Parameters:

delta – parameter \delta in the definition of Modified Dynamic Minimum Keystroke

Returns:

number of iterations to converge