mlstatpy.nlp.completion
- class mlstatpy.nlp.completion.CompletionTrieNode(value, leave, weight=1.0, disp=None)
Node definition in a trie used for completion, see Complétion. This implementation is not memory efficient: it does not hold above 200,000 words. It should be done another way (Cython, C++).
- all_completions() → List[Tuple[CompletionTrieNode, List[str]]]
Retrieves all completions for a node. The method does not need precompute_stat() to be run first.
- all_mks_completions() → List[Tuple[CompletionTrieNode, List[CompletionTrieNode]]]
Retrieves all completions for a node. The method assumes precompute_stat() was run first.
- static build(words) → CompletionTrieNode
Builds a trie.
- Parameters:
words – list of (word), or (weight, word), or (weight, word, display string)
- Returns:
root of the trie (CompletionTrieNode)
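The three accepted input shapes can be brought to a single (weight, word, display string) form before insertion. The sketch below is illustrative only: `normalize_words` is a hypothetical helper, not part of mlstatpy, and the default weight used for a bare word (its position in the list) is an assumption, not the library's documented behaviour.

```python
from typing import List, Optional, Tuple


def normalize_words(words) -> List[Tuple[float, str, Optional[str]]]:
    """Hypothetical helper: map every entry to (weight, word, display string)."""
    normalized = []
    for position, item in enumerate(words):
        if isinstance(item, str):
            # Bare word: ASSUMPTION, the weight defaults to the list position.
            normalized.append((float(position), item, None))
        elif len(item) == 1:
            # One-element tuple (word,): same assumption as above.
            normalized.append((float(position), item[0], None))
        elif len(item) == 2:
            weight, word = item
            normalized.append((float(weight), word, None))
        elif len(item) == 3:
            weight, word, disp = item
            normalized.append((float(weight), word, disp))
        else:
            raise ValueError(f"unexpected entry: {item!r}")
    return normalized


entries = normalize_words(["a", (0.5, "b"), (2, "c", "C!")])
```

Each entry of `entries` then carries the same three fields, whichever form was passed in.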
- find(prefix: str) → CompletionTrieNode
Returns the node which holds all completions starting with a given prefix.
- Parameters:
prefix – prefix
- Returns:
node, or None if there is no result
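To make the lookup concrete, here is a minimal, self-contained sketch of a character trie with a prefix search that returns a node or None. It is not the mlstatpy implementation: `SketchNode`, `sketch_build` and `sketch_find` are hypothetical names, and weights and display strings are left out.

```python
from typing import Dict, Iterable, Optional


class SketchNode:
    """Hypothetical trie node, much simpler than CompletionTrieNode."""

    def __init__(self, value: str, leave: bool = False):
        self.value = value  # prefix spelled by the path from the root
        self.leave = leave  # True when that prefix is a complete word
        self.children: Dict[str, "SketchNode"] = {}


def sketch_build(words: Iterable[str]) -> SketchNode:
    """Insert every word character by character and return the root."""
    root = SketchNode("")
    for word in words:
        node = root
        for i, ch in enumerate(word):
            if ch not in node.children:
                node.children[ch] = SketchNode(word[: i + 1])
            node = node.children[ch]
        node.leave = True
    return root


def sketch_find(root: SketchNode, prefix: str) -> Optional[SketchNode]:
    """Walk down the trie; return None as soon as a character is missing."""
    node = root
    for ch in prefix:
        node = node.children.get(ch)
        if node is None:
            return None
    return node


trie = sketch_build(["car", "cart", "dog"])
hit = sketch_find(trie, "ca")    # node for the prefix "ca"
miss = sketch_find(trie, "cat")  # no word starts with "cat"
```

Every completion of a prefix lives in the subtree under the returned node, which is why a single walk of length `len(prefix)` is enough.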
- items() → Iterator[Tuple[float, str, CompletionTrieNode]]
Iterates over the children, yielding (weight, key, child) for each.
- items_list() → List[CompletionTrieNode]
Returns all child nodes, including the node itself, in a list.
- Returns:
list of nodes
- iter_leaves(max_weight=None) → Iterator[Tuple[float, str]]
Iterates over the leaves sorted by weight, yielding (weight, value).
- Parameters:
max_weight – keep all values under this threshold, or None to keep all
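The behaviour described above (sort by weight, optionally stop at a threshold) can be sketched on a plain list of (weight, value) pairs. `sketch_iter_leaves` is a hypothetical stand-in, not the library method, and treating the threshold as inclusive is an assumption, since the description does not say whether the bound is strict.

```python
from typing import Iterable, Iterator, Optional, Tuple


def sketch_iter_leaves(
    leaves: Iterable[Tuple[float, str]], max_weight: Optional[float] = None
) -> Iterator[Tuple[float, str]]:
    """Yield (weight, value) by increasing weight, stopping past max_weight."""
    for weight, value in sorted(leaves):
        # ASSUMPTION: the threshold is inclusive (weight <= max_weight is kept).
        if max_weight is not None and weight > max_weight:
            break
        yield weight, value


pairs = [(2.0, "b"), (1.0, "a"), (3.0, "c")]
kept = list(sketch_iter_leaves(pairs, max_weight=2.0))
everything = list(sketch_iter_leaves(pairs))
```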
- leaves() → Iterator[CompletionTrieNode]
Iterates over the leaves.
- min_dynamic_keystroke(word: str) → Tuple[int, int]
Returns the dynamic minimum keystrokes for a word.
- Parameters:
word – word
- Returns:
number, length of best prefix, iteration it stops moving
This function must be called after precompute_stat() and update_stat_dynamic(). See Dynamic Minimum Keystroke.
\begin{eqnarray*} K(q, k, S) &=& \min\left\{ i \mid s_i \succ q[1..k], s_i \in S \right\} \\ M'(q, S) &=& \min_{0 \leqslant k \leqslant l(q)} \left\{ M'(q[1..k], S) + K(q, k, S) \mid q[1..k] \in S \right\} \end{eqnarray*}
- min_dynamic_keystroke2(word: str) → Tuple[int, int]
Returns the modified dynamic minimum keystrokes for a word.
- Parameters:
word – word
- Returns:
number, length of best prefix, iteration it stops moving
This function must be called after precompute_stat() and update_stat_dynamic(). See Modified Dynamic Minimum Keystroke.
\begin{eqnarray*} K(q, k, S) &=& \min\left\{ i \mid s_i \succ q[1..k], s_i \in S \right\} \\ M"(q, S) &=& \min \left\{ \begin{array}{l} \min_{1 \leqslant k \leqslant l(q)} \left\{ M"(q[1..k-1], S) + 1 + K(q, k, S) \mid q[1..k] \in S \right\} \\ \min_{0 \leqslant k \leqslant l(q)} \left\{ M"(q[1..k], S) + \delta + K(q, k, S) \mid q[1..k] \in S \right\} \end{array} \right . \end{eqnarray*}
- min_keystroke(word: str) → Tuple[int, int]
Returns the minimum keystrokes for a word, without optimisation. Use this function when only a few values need to be computed; use min_keystroke0() to compute all of them.
- Parameters:
word – word
- Returns:
number, length of best prefix
\begin{eqnarray*} K(q, k, S) &=& \min\left\{ i \mid s_i \succ q[1..k], s_i \in S \right\} \\ M(q, S) &=& \min_{0 \leqslant k \leqslant l(q)} k + K(q, k, S) \end{eqnarray*}
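The definition of M(q, S) above can be evaluated by brute force on a small ranked list, which helps when checking values by hand. This sketch rests on stated assumptions and is not the library's algorithm: `sketch_min_keystroke` is a hypothetical name, K(q, k, S) is read as the 1-based position of q among the completions that start with q[1..k] (in display order), and typing the whole word is taken to cost l(q) keystrokes with no selection step.

```python
from typing import List, Tuple


def sketch_min_keystroke(word: str, ranked: List[str]) -> Tuple[int, int]:
    """Return (keystrokes, length of best prefix) for `word`.

    `ranked` lists the completions in display order, best first.
    """
    # Baseline: type every character, no suggestion used (assumption).
    best = (len(word), len(word))
    for k in range(len(word)):
        prefix = word[:k]
        shown = [w for w in ranked if w.startswith(prefix)]
        if word in shown:
            # k typed characters + position of the word in the suggestions.
            cost = k + shown.index(word) + 1
            if cost < best[0]:
                best = (cost, k)
    return best


ranked = ["the", "then", "there", "these", "think"]
think_cost = sketch_min_keystroke("think", ranked)
these_cost = sketch_min_keystroke("these", ranked)
```

With this list, "think" is cheapest after typing "thi" (three characters, then the first suggestion), whereas "these" is cheapest straight from the empty prefix, where it is the fourth suggestion.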
- min_keystroke0(word: str) → Tuple[int, int]
Returns the minimum keystrokes for a word.
- Parameters:
word – word
- Returns:
number, length of best prefix, iteration it stops moving
This function must be called after precompute_stat() and update_stat_dynamic().
\begin{eqnarray*} K(q, k, S) &=& \min\left\{ i \mid s_i \succ q[1..k], s_i \in S \right\} \\ M(q, S) &=& \min_{0 \leqslant k \leqslant l(q)} k + K(q, k, S) \end{eqnarray*}
- precompute_stat()
Computes and stores the list of completions for each node, then computes the minimum keystrokes (mks).
- Parameters:
clean – clean stat
- property root
Returns the initial node with no parent.
- str_all_completions(maxn=10, use_precompute=True) → str
Builds a string with all completions for all prefixes along the paths.
- Parameters:
maxn – maximum number of completions to show
use_precompute – use intermediate results built by precompute_stat()
- Returns:
str
- update_stat_dynamic(delta=0.8)
Computes the dynamic mks (see Dynamic Minimum Keystroke). Must be called after precompute_stat().
- Parameters:
delta – parameter \(\delta\) in the definition of Modified Dynamic KeyStroke
- Returns:
number of iterations to converge