mlstatpy.nlp.completion

class mlstatpy.nlp.completion.CompletionTrieNode(value, leave, weight=1.0, disp=None)

Node of a trie used for completion, see Complétion. This implementation is not memory-efficient: it does not scale beyond about 200,000 words. A more efficient version would need another approach (Cython, C++).

all_completions() → List[Tuple[CompletionTrieNode, List[str]]]

Retrieves all completions for a node. The method does not require precompute_stat() to be run first.

all_mks_completions() → List[Tuple[CompletionTrieNode, List[CompletionTrieNode]]]

Retrieves all completions for a node. The method assumes precompute_stat() was run first.

static build(words) → CompletionTrieNode

Builds a trie.

Parameters:

words – list of (word), (weight, word), or (weight, word, display string)

Returns:

root of the trie (CompletionTrieNode)

find(prefix: str) → CompletionTrieNode

Returns the node which holds all completions starting with a given prefix.

Parameters:

prefix – prefix

Returns:

the node, or None if no completion starts with the prefix
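
The behaviour of build and find can be illustrated with a minimal, standalone sketch. It does not use mlstatpy: the class `TrieNode` and the two functions below are hypothetical stand-ins that only assume what is described above (words given as (weight, word) pairs, one node per prefix, `None` when the prefix is unknown).

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    """Hypothetical, simplified stand-in for CompletionTrieNode."""

    def __init__(self, value: str, leave: bool, weight: float = 1.0):
        self.value = value    # prefix spelled by the path from the root
        self.leave = leave    # True if `value` is a complete word
        self.weight = weight  # only meaningful when `leave` is True
        self.children: Dict[str, "TrieNode"] = {}


def build(words: List[Tuple[float, str]]) -> TrieNode:
    """Builds a trie from (weight, word) pairs and returns its root."""
    root = TrieNode("", False)
    for weight, word in words:
        node = root
        for i, ch in enumerate(word):
            if ch not in node.children:
                node.children[ch] = TrieNode(word[: i + 1], False)
            node = node.children[ch]
        node.leave = True
        node.weight = weight
    return root


def find(root: TrieNode, prefix: str) -> Optional[TrieNode]:
    """Returns the node reached by `prefix`, or None if no word starts with it."""
    node = root
    for ch in prefix:
        node = node.children.get(ch)
        if node is None:
            return None
    return node


root = build([(1.0, "ab"), (2.0, "abc"), (3.0, "b")])
node = find(root, "ab")
```

Here `find(root, "ab")` returns a node that is itself a word, while `find(root, "a")` returns an intermediate node and `find(root, "x")` returns `None`.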

items() → Iterator[Tuple[float, str, CompletionTrieNode]]

Iterates over the children, yielding (weight, key, child) tuples.

items_list() → List[CompletionTrieNode]

Returns all children nodes, including the node itself, in a list.

Returns:

list of CompletionTrieNode

iter_leaves(max_weight=None) → Iterator[Tuple[float, str]]

Iterates over the leaves sorted by weight, yielding (weight, value) pairs.

Parameters:

max_weight – keep all values under this threshold, or None for all
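
As a rough illustration of this iteration order, the sketch below walks a hypothetical dict-based trie (not the mlstatpy class) and yields the leaves sorted by weight. Treating the threshold as inclusive (`<=`) is an assumption, since the description above does not say whether a weight equal to `max_weight` is kept.

```python
from typing import Iterator, Optional, Tuple


def iter_leaves(node: dict, max_weight: Optional[float] = None) -> Iterator[Tuple[float, str]]:
    """Yields (weight, value) for every leaf under `node`, sorted by weight.

    Each node is a plain dict with an optional "leaf" entry, a
    (weight, value) tuple, and a "children" dict (hypothetical layout).
    """

    def walk(current: dict) -> Iterator[Tuple[float, str]]:
        leaf = current.get("leaf")
        # Assumption: the threshold is inclusive.
        if leaf is not None and (max_weight is None or leaf[0] <= max_weight):
            yield leaf
        for child in current.get("children", {}).values():
            yield from walk(child)

    # Collect first, then sort: leaves are not stored in weight order.
    yield from sorted(walk(node))


trie = {
    "children": {
        "a": {
            "leaf": (2.0, "a"),
            "children": {"b": {"leaf": (1.0, "ab"), "children": {}}},
        },
        "b": {"leaf": (3.0, "b"), "children": {}},
    }
}
all_leaves = list(iter_leaves(trie))
light_leaves = list(iter_leaves(trie, max_weight=2.0))
```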

leaves() → Iterator[CompletionTrieNode]

Iterates over the leaves.

min_dynamic_keystroke(word: str) → Tuple[int, int]

Returns the dynamic minimum keystrokes for a word.

Parameters:

word – word

Returns:

number of keystrokes, length of the best prefix, iteration at which the value stops changing

This function must be called after precompute_stat() and update_stat_dynamic(). See Dynamic Minimum Keystroke.

\begin{eqnarray*}
K(q, k, S) &=& \min\left\{ i \mid s_i \succ q[1..k], s_i \in S \right\} \\
M'(q, S) &=& \min_{0 \leqslant k \leqslant l(q)} \left\{ M'(q[1..k], S) + K(q, k, S) \mid q[1..k] \in S \right\}
\end{eqnarray*}
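
The recursion for M' can be sketched on a plain list of completions, without the trie. The code below is a simplified reading of the formula, not the library's implementation, and rests on several assumptions: `completions` is already sorted in display order (best first), K is the 1-based rank of the word among the completions sharing the prefix, the static cost k + K(q, k, S) serves as the baseline, and the recursion only goes through proper prefixes that are themselves completions. The names `rank` and `dynamic_min_keystroke` are hypothetical.

```python
from functools import lru_cache
from typing import List


def rank(word: str, prefix: str, completions: List[str]) -> int:
    """1-based position of `word` among the completions starting with `prefix`."""
    shown = [s for s in completions if s.startswith(prefix)]
    return shown.index(word) + 1


def dynamic_min_keystroke(word: str, completions: List[str]) -> int:
    vocabulary = set(completions)

    @lru_cache(maxsize=None)
    def cost(q: str) -> int:
        best = len(q)  # typing every character always works
        if q in vocabulary:
            # Static baseline: type k characters, then select q at its rank.
            for k in range(len(q)):
                best = min(best, k + rank(q, q[:k], completions))
            # Dynamic step: reach a shorter completion q[:k] cheaply,
            # then select q in the list shown for that prefix.
            for k in range(1, len(q)):
                if q[:k] in vocabulary:
                    best = min(best, cost(q[:k]) + rank(q, q[:k], completions))
        return best

    return cost(word)


S = ["machine", "mab", "mac", "mad", "mae", "machine learning"]
dynamic_cost = dynamic_min_keystroke("machine learning", S)
```

With this list, the static cost of "machine learning" is 6 (for instance, select it directly at rank 6), whereas the dynamic cost is 3: select "machine" at rank 1, then select "machine learning" at rank 2 among the completions of "machine".
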
min_dynamic_keystroke2(word: str) → Tuple[int, int]

Returns the modified dynamic minimum keystrokes for a word.

Parameters:

word – word

Returns:

number of keystrokes, length of the best prefix, iteration at which the value stops changing

This function must be called after precompute_stat() and update_stat_dynamic(). See Modified Dynamic Minimum Keystroke.

\begin{eqnarray*}
K(q, k, S) &=& \min\left\{ i \mid s_i \succ q[1..k], s_i \in S \right\} \\
M''(q, S) &=& \min \left\{ \begin{array}{l}
\min_{1 \leqslant k \leqslant l(q)} \left\{ M''(q[1..k-1], S) + 1 + K(q, k, S) \mid q[1..k] \in S \right\} \\
\min_{0 \leqslant k \leqslant l(q)} \left\{ M''(q[1..k], S) + \delta + K(q, k, S) \mid q[1..k] \in S \right\}
\end{array} \right .
\end{eqnarray*}

min_keystroke(word: str) → Tuple[int, int]

Returns the minimum keystrokes for a word without optimisation. Use this function if you only have a couple of values to compute; use min_keystroke0() to compute all of them.

Parameters:

word – word

Returns:

number of keystrokes, length of the best prefix

See Problème d’optimisation.

\begin{eqnarray*}
K(q, k, S) &=& \min\left\{ i \mid s_i \succ q[1..k], s_i \in S \right\} \\
M(q, S) &=& \min_{0 \leqslant k \leqslant l(q)} k + K(q, k, S)
\end{eqnarray*}
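
The formula for M can be checked by brute force on a plain list of completions. The sketch below is not the library's implementation. It assumes that `completions` is already sorted in display order (best first), that K is the 1-based rank of the word among the completions sharing the prefix, and that typing the whole word (cost l(q)) is always possible. The function name `min_keystroke_sketch` is hypothetical.

```python
from typing import List, Tuple


def min_keystroke_sketch(word: str, completions: List[str]) -> Tuple[int, int]:
    """Returns (M(q, S), length of the best prefix) for `word`."""
    best, best_k = len(word), len(word)  # fallback: type every character
    for k in range(len(word) + 1):
        prefix = word[:k]
        # Completions shown after typing `prefix`, in display order.
        shown = [s for s in completions if s.startswith(prefix)]
        if word in shown:
            cost = k + shown.index(word) + 1  # k typed characters + rank
            if cost < best:
                best, best_k = cost, k
    return best, best_k


S = ["abc", "abd", "b", "abcd"]
result_abc = min_keystroke_sketch("abc", S)
result_abcd = min_keystroke_sketch("abcd", S)
```

With this list, "abc" costs a single keystroke (it is the first suggestion before anything is typed), while "abcd" gains nothing from completion and costs its full length of 4.
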
min_keystroke0(word: str) → Tuple[int, int]

Returns the minimum keystrokes for a word.

Parameters:

word – word

Returns:

number of keystrokes, length of the best prefix, iteration at which the value stops changing

This function must be called after precompute_stat() and update_stat_dynamic().

See Problème d’optimisation.

\begin{eqnarray*}
K(q, k, S) &=& \min\left\{ i \mid s_i \succ q[1..k], s_i \in S \right\} \\
M(q, S) &=& \min_{0 \leqslant k \leqslant l(q)} k + K(q, k, S)
\end{eqnarray*}

precompute_stat()

Computes and stores the list of completions for each node, and computes the minimum keystrokes (mks).

Parameters:

clean – clean stat

property root

Returns the initial node with no parent.

str_all_completions(maxn=10, use_precompute=True) → str

Builds a string with all completions for all prefixes along the paths.

Parameters:
  • maxn – maximum number of completions to show

  • use_precompute – use the intermediate results built by precompute_stat()

Returns:

str

unsorted_iter()

Iterates over all nodes.

update_stat_dynamic(delta=0.8)

Computes the dynamic mks (see Dynamic Minimum Keystroke). Must be called after precompute_stat().

Parameters:

delta – parameter \(\delta\) in the definition of Modified Dynamic Minimum Keystroke

Returns:

number of iterations needed to converge