mlstatpy.nlp.completion_simple

class mlstatpy.nlp.completion_simple.CompletionElement(value: str, weight=1.0, disp=None)

Defines an element in a completion system. It contains the following members:

value: the completion
weight: a weight or a position; a completion with a lower weight is assumed to be shown at a lower position
disp: display string (no impact on the algorithm)
mks0: value of minimum keystroke
mks0_: length of the prefix to obtain mks0
mks1: value of dynamic minimum keystroke
mks1_: length of the prefix to obtain mks1
mks2: value of modified dynamic minimum keystroke
mks2_: length of the prefix to obtain mks2

Parameters:

value – the completion (a string)
weight – ordering (the lower the weight, the earlier the completion is shown)
disp – original string; use it to identify the node

static empty_prefix(): Returns an instance filled with an empty prefix.

init_metrics(position: int, completions: List[CompletionElement] | None = None)

Initializes the metrics.

Parameters:

position – position in the completion system when the prefix is empty; positions start at 0
completions – displayed completions; if not None, the method stores them in member _completions

Returns:

boolean indicating whether there was an update

str_all_completions(maxn=10, use_precompute=True) → str

Builds a string with all completions for all prefixes along the paths. This is only available if parameter completions was used when calling method update_metrics.

Parameters:

maxn – maximum number of completions to show
use_precompute – use intermediate results built by method precompute_stat

Returns:

str

str_mks() → str: Returns a string with metric information.

str_mks0() → str: Returns a string with metric information.

update_metrics(prefix: str, position: int, improved: dict, delta: float, completions: List[CompletionElement] | None = None, iteration=-1)

Updates the metrics.

Parameters:

prefix – prefix
position – position in the completion system when the prefix has length k; positions start at 0
improved – if one metric is lower than the completion length, it can be used to improve other queries
delta – delta in the dynamic modified mks
completions – displayed completions; if not None, the method stores them in member _completions
iteration – for debugging purposes, indicates when this improvement was detected

Returns:

boolean indicating whether there was an update
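
As a rough illustration of how mks0 and mks0_ relate to the position argument, the sketch below computes a minimum keystroke value by brute force. It is not the library's implementation: the function name minimum_keystrokes, the max_shown cutoff, and the cost model (typing k characters, then selecting the completion shown at 0-based position p, costs k + p + 1 keystrokes, and never more than typing the full value) are assumptions made for this example.

```python
from typing import Dict, List, Tuple


def minimum_keystrokes(
    completions: List[Tuple[str, float]], max_shown: int = 10
) -> Dict[str, Tuple[int, int]]:
    """Returns {value: (mks0, prefix_length)} for every completion.

    Assumed cost model (not taken from the library): after typing k
    characters, picking the completion displayed at 0-based position p
    costs k + p + 1 keystrokes; typing the whole value is always possible.
    """
    # Lower weight is displayed first; ties are broken alphabetically.
    ordered = [v for v, _ in sorted(completions, key=lambda c: (c[1], c[0]))]
    result = {}
    for value in ordered:
        # Fallback: type every character of the value.
        best, best_k = len(value), len(value)
        for k in range(len(value)):
            prefix = value[:k]
            # Completions displayed for this prefix, truncated to max_shown.
            shown = [v for v in ordered if v.startswith(prefix)][:max_shown]
            if value in shown:
                cost = k + shown.index(value) + 1
                if cost < best:
                    best, best_k = cost, k
        result[value] = (best, best_k)
    return result


example = minimum_keystrokes([("token", 1.0), ("tokyo", 2.0), ("to", 3.0)])
print(example)
```

In this toy set, "token" is reached with one keystroke from the empty prefix, "tokyo" with two, and "to" is cheapest when typed in full, so its prefix length equals its own length.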

class mlstatpy.nlp.completion_simple.CompletionSystem(elements: List[CompletionElement])

Defines a completion system.

compare_with_trie(delta=0.8)

Compares the results with the other implementation.

Parameters:

delta – parameter delta in the dynamic modified mks

Returns:

None or differences

compute_metrics(ffilter=None, delta=0.8, details=False) → int

Computes the metric for the completion itself.

Parameters:

ffilter – filter function
delta – parameter delta in the dynamic modified mks
details – log more details about displayed completions

Returns:

number of iterations

The function ends by sorting the set of completions in alphabetical order.

enumerate_test_metric(qset: Iterator[Tuple[str, float]]) → Iterator[Tuple[CompletionElement, CompletionElement]]

Evaluates the completion set on a set of queries. The function returns a list of CompletionElement with the three metrics $M$, $M'$, $M''$ for these particular queries.

Parameters:

qset – list of tuple(str, float) = (query, weight)

Returns:

list of tuples of CompletionElement; the first element is the query, the second is the matching completion or None

The method compute_metrics needs to be called first.

find(value: str, is_sorted=False) → CompletionElement

Finds an item in the list (not very efficient).

Parameters:

value – string to find
is_sorted – if True, the function assumes the elements are sorted in alphabetical order

Returns:

element or None
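
To show why an is_sorted flag can matter, here is a minimal sketch of a lookup that switches from a linear scan to a binary search when the values are known to be sorted. How the library actually uses is_sorted is not stated here; the helper find_value and its index-returning behaviour are hypothetical and only illustrate the idea.

```python
from bisect import bisect_left
from typing import List, Optional


def find_value(values: List[str], value: str, is_sorted: bool = False) -> Optional[int]:
    """Returns the index of value in values, or None if it is absent.

    Hypothetical helper: with is_sorted=True the list must be sorted in
    alphabetical order, which allows a binary search (O(log n)) instead
    of a linear scan (O(n)).
    """
    if is_sorted:
        i = bisect_left(values, value)
        if i < len(values) and values[i] == value:
            return i
        return None
    for i, v in enumerate(values):
        if v == value:
            return i
    return None


print(find_value(["to", "token", "tokyo"], "token", is_sorted=True))
print(find_value(["tokyo", "to", "token"], "token"))
```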

items() → Iterator[Tuple[str, CompletionElement]]: Iterates on (e.value, e).

sort_values(): Sorts the elements by value.

sort_weight(): Sorts the elements by weight.

test_metric(qset: Iterator[Tuple[str, float]]) → Dict[str, float]

Evaluates the completion set on a set of queries. The function returns a dictionary with the aggregated metrics and some statistics about them.

Parameters:

qset – list of tuple(str, float) = (query, weight)

Returns:

dictionary with the aggregated metrics

The method compute_metrics needs to be called first. It then calls enumerate_test_metric.

to_dict() → Dict[str, CompletionElement]: Returns a dictionary.

tuples() → Iterator[Tuple[float, str]]: Iterates on (e.weight, e.value).
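
The shapes produced by items, tuples, to_dict and the two sort methods can be pictured with a small stand-in. The Element class below is hypothetical and only mimics the value and weight members; it is not the library's CompletionElement, and the tie-breaking rule used when sorting by weight is an assumption.

```python
from typing import NamedTuple


class Element(NamedTuple):
    """Hypothetical stand-in exposing only value and weight."""
    value: str
    weight: float


elements = [Element("tokyo", 2.0), Element("to", 3.0), Element("token", 1.0)]

# Ordering by value (alphabetical) versus ordering by weight (lower first,
# ties broken by value, which is an assumption of this sketch).
by_value = sorted(elements, key=lambda e: e.value)
by_weight = sorted(elements, key=lambda e: (e.weight, e.value))

# Shapes analogous to items(), tuples() and to_dict().
items = [(e.value, e) for e in by_value]
tuples = [(e.weight, e.value) for e in by_weight]
as_dict = {e.value: e for e in elements}

print([e.value for e in by_value])
print(tuples)
```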