mlstatpy.nlp.completion_simple

class mlstatpy.nlp.completion_simple.CompletionElement(value: str, weight=1.0, disp=None)

Defines an element of a completion system. It has the following members:

  • value: the completion

  • weight: a weight or a position; a completion with a lower weight is assumed to be shown at a lower position (earlier in the list)

  • disp: display string (no impact on the algorithm)

  • mks0: value of minimum keystroke

  • mks0_: length of the prefix to obtain mks0

  • mks1: value of dynamic minimum keystroke

  • mks1_: length of the prefix to obtain mks1

  • mks2: value of modified dynamic minimum keystroke

  • mks2_: length of the prefix to obtain mks2

Parameters:
  • value – the completion (a string)

  • weight – ordering (the lower the weight, the earlier the completion is shown)

  • disp – original string, use this to identify the node
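
The static metric mks0 and its prefix length mks0_ can be illustrated with a brute-force sketch. It assumes that mks0 is the minimum, over prefix lengths k, of k plus the 1-based position of the completion among the completions sharing that prefix (sorted by increasing weight), and that it never exceeds the length of the completion. The function name `minimum_keystrokes` and the alphabetical tie-break on equal weights are illustrative assumptions, not part of the library.

```python
from typing import List, Tuple


def minimum_keystrokes(value: str, completions: List[Tuple[float, str]]) -> Tuple[int, int]:
    """Brute-force sketch returning (mks0, mks0_) for `value`.

    `completions` is a list of (weight, value) pairs; a lower weight
    means the completion is shown first.
    """
    # Typing the whole string always works, so it is the starting point.
    best, best_k = len(value), len(value)
    for k in range(len(value) + 1):
        prefix = value[:k]
        # Completions displayed for this prefix, ordered by weight
        # (ties broken alphabetically, an assumption of this sketch).
        shown = sorted((w, v) for w, v in completions if v.startswith(prefix))
        values = [v for _, v in shown]
        if value not in values:
            continue
        # k typed characters, then (position + 1) keystrokes to reach the completion.
        cost = k + values.index(value) + 1
        if cost < best:
            best, best_k = cost, k
    return best, best_k


completions = [(1.0, "actually"), (2.0, "actual"), (3.0, "act"), (4.0, "bond")]
for _, v in completions:
    print(v, minimum_keystrokes(v, completions))
```

In this toy set, "bond" is last when nothing is typed but first once "b" is typed, so a short prefix beats both the empty prefix and typing the full word.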

static empty_prefix()

Returns an instance filled with an empty prefix.

init_metrics(position: int, completions: List[CompletionElement] | None = None)

Initializes the metrics.

Parameters:
  • position – position in the completion system when the prefix is empty, starting from 0

  • completions – displayed completions; if not None, the method stores them in member _completions

Returns:

boolean indicating whether there was an update

str_all_completions(maxn=10, use_precompute=True) → str

Builds a string with all completions for all prefixes along the paths. This is only available if the parameter completions was used when calling the method update_metrics.

Parameters:
  • maxn – maximum number of completions to show

  • use_precompute – use intermediate results built by the method precompute_stat

Returns:

str

str_mks() → str

Returns a string with metric information.

str_mks0() → str

Returns a string with metric information.

update_metrics(prefix: str, position: int, improved: dict, delta: float, completions: List[CompletionElement] | None = None, iteration=-1)

Updates the metrics.

Parameters:
  • prefix – prefix

  • position – position in the completion system when the prefix has length k, starting from 0

  • improved – if one metric is lower than the completion length, the completion can be used to improve other queries

  • delta – delta in the dynamic modified mks

  • completions – displayed completions; if not None, the method stores them in member _completions

  • iteration – for debugging purposes, indicates when this improvement was detected

Returns:

boolean indicating whether there was an update
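
As a rough illustration of the update step, restricted to the static metric, the sketch below keeps the best cost seen so far and reports whether it improved. It assumes a candidate cost of len(prefix) + position + 1. The class `MetricState` is hypothetical, and the dynamic metrics mks1 and mks2 (which depend on improved and delta) are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class MetricState:
    """Hypothetical stand-in holding only the static metric of one completion."""

    value: str
    mks0: int = 0   # best number of keystrokes found so far
    mks0_: int = 0  # prefix length that gives mks0

    def __post_init__(self):
        # Without any completion system, the whole string must be typed.
        self.mks0 = len(self.value)
        self.mks0_ = len(self.value)

    def update(self, prefix: str, position: int) -> bool:
        """Returns True if typing `prefix` and then picking the completion
        at 0-based `position` beats the best cost known so far."""
        cost = len(prefix) + position + 1
        if cost < self.mks0:
            self.mks0, self.mks0_ = cost, len(prefix)
            return True
        return False


state = MetricState("bonjour")
print(state.update("", 3), state.mks0, state.mks0_)
print(state.update("b", 0), state.mks0, state.mks0_)
print(state.update("bo", 0), state.mks0, state.mks0_)
```

The boolean return value lets a caller loop until no completion improves any more, which is one plausible reason for the method to report whether an update happened.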

class mlstatpy.nlp.completion_simple.CompletionSystem(elements: List[CompletionElement])

Defines a completion system.

compare_with_trie(delta=0.8)

Compares the results with the other implementation.

Parameters:

delta – parameter delta in the dynamic modified mks

Returns:

None or differences

compute_metrics(ffilter=None, delta=0.8, details=False) → int

Computes the metric for the completion itself.

Parameters:
  • ffilter – filter function

  • delta – parameter delta in the dynamic modified mks

  • details – log more details about displayed completions

Returns:

number of iterations

The function ends by sorting the set of completions in alphabetical order.

enumerate_test_metric(qset: Iterator[Tuple[str, float]]) → Iterator[Tuple[CompletionElement, CompletionElement]]

Evaluates the completion set on a set of queries. The function returns an iterator over CompletionElement instances carrying the three metrics \(M\), \(M'\), \(M''\) for these particular queries.

Parameters:

qset – list of tuple(str, float) = (query, weight)

Returns:

tuples of CompletionElement: the first one is the query, the second one is None or the matching completion

The method compute_metrics needs to be called first.

find(value: str, is_sorted=False) → CompletionElement

Finds an item in the list. Not very efficient.

Parameters:
  • value – string to find

  • is_sorted – if True, the function assumes the elements are sorted in alphabetical order

Returns:

element or None
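
The two search modes can be sketched as follows: a linear scan when nothing is known about the order, and a binary search when the elements are sorted by value. The function `find_value` and the plain list of strings are illustrative assumptions; the library method works on CompletionElement instances and its actual implementation may differ.

```python
from bisect import bisect_left
from typing import List, Optional


def find_value(values: List[str], value: str, is_sorted: bool = False) -> Optional[int]:
    """Returns the index of `value` in `values`, or None if it is absent."""
    if is_sorted:
        # Binary search, valid only if `values` follows Python's default string order.
        i = bisect_left(values, value)
        return i if i < len(values) and values[i] == value else None
    # Linear scan, works for any order.
    for i, v in enumerate(values):
        if v == value:
            return i
    return None


words = ["act", "actual", "actually", "bond"]
print(find_value(words, "actual", is_sorted=True))
print(find_value(words, "missing"))
```

Passing is_sorted=True on an unsorted list would silently miss elements, which is why the flag has to be set by the caller.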

items() → Iterator[Tuple[str, CompletionElement]]

Iterates on (e.value, e).

sort_values()

Sorts the elements by value.

sort_weight()

Sorts the elements by weight.

test_metric(qset: Iterator[Tuple[str, float]]) → Dict[str, float]

Evaluates the completion set on a set of queries. The function returns a dictionary with the aggregated metrics and some statistics about them.

Parameters:

qset – list of tuple(str, float) = (query, weight)

Returns:

dictionary of aggregated metrics and statistics

The method compute_metrics needs to be called first. It then calls the method enumerate_test_metric.
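
The aggregation can be pictured as a weighted sum over the queries. The sketch below is only an assumption about what such a summary might contain: the function `aggregate`, its input triples (query, weight, keystrokes) and the dictionary keys are all hypothetical and are not the keys returned by test_metric.

```python
from typing import Dict, Iterable, Tuple


def aggregate(results: Iterable[Tuple[str, float, int]]) -> Dict[str, float]:
    """Aggregates (query, weight, keystrokes) triples into summary statistics."""
    stats = {"n": 0.0, "sum_weight": 0.0, "sum_length": 0.0, "sum_keystrokes": 0.0}
    for query, weight, keystrokes in results:
        stats["n"] += 1
        stats["sum_weight"] += weight
        # Weighted number of characters typed without any completion system.
        stats["sum_length"] += weight * len(query)
        # Weighted number of keystrokes with the completion system.
        stats["sum_keystrokes"] += weight * keystrokes
    if stats["sum_length"] > 0:
        # Share of keystrokes saved thanks to the completions.
        stats["saved_ratio"] = 1.0 - stats["sum_keystrokes"] / stats["sum_length"]
    return stats


print(aggregate([("bond", 1.0, 2), ("actual", 2.0, 2)]))
```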

to_dict() → Dict[str, CompletionElement]

Returns a dictionary.

tuples() → Iterator[Tuple[float, str]]

Iterates on (e.weight, e.value).
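
The iteration and sorting helpers of CompletionSystem can be mimicked with a small stand-in container. The classes `Element` and `TinySystem` below are hypothetical; in particular, having to_dict map each value to its element is an assumption suggested by the return type.

```python
from typing import Dict, Iterator, List, NamedTuple, Tuple


class Element(NamedTuple):
    """Hypothetical minimal element: a completion and its weight."""

    value: str
    weight: float


class TinySystem:
    """Hypothetical stand-in mirroring the accessors described above."""

    def __init__(self, elements: List[Element]):
        self._elements = list(elements)

    def items(self) -> Iterator[Tuple[str, Element]]:
        for e in self._elements:
            yield e.value, e

    def tuples(self) -> Iterator[Tuple[float, str]]:
        for e in self._elements:
            yield e.weight, e.value

    def to_dict(self) -> Dict[str, Element]:
        return {e.value: e for e in self._elements}

    def sort_values(self) -> None:
        self._elements.sort(key=lambda e: e.value)

    def sort_weight(self) -> None:
        self._elements.sort(key=lambda e: e.weight)


system = TinySystem([Element("bond", 2.0), Element("act", 3.0), Element("actual", 1.0)])
system.sort_values()
print([v for v, _ in system.items()])
system.sort_weight()
print(list(system.tuples()))
```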