class, y_score=None, sample_weight=None, df=None)

Helper to draw a ROC curve.

Initialisation with a dataframe and two or three columns:

  • column 1: score (y_score)

  • column 2: expected answer (boolean) (y_true)

  • column 3: weight (optional) (sample_weight)

  • y_true – if df is None, y_true, y_score and sample_weight must be filled. y_true tells whether or not the answer is true: a true value means the prediction is right.

  • y_score – score prediction

  • sample_weight – weights

  • df – dataframe, array or list; it must contain 2 or 3 columns, always in the same order
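The two ways of supplying data described above (one table with 2 or 3 columns, or separate y_true, y_score, sample_weight sequences) carry the same information. The sketch below is only an illustration of that column order; `split_columns` is a hypothetical helper that is not part of the documented class, and the default weight of 1.0 for a missing third column is an assumption.

```python
def split_columns(rows):
    """Split rows of (score, label[, weight]) into three parallel lists.

    Hypothetical helper: it only mirrors the column order described above
    (score, expected answer, optional weight).
    """
    scores, labels, weights = [], [], []
    for row in rows:
        if len(row) not in (2, 3):
            raise ValueError("each row needs 2 or 3 columns")
        scores.append(float(row[0]))
        labels.append(bool(row[1]))
        # Assumption: a missing weight column means a weight of 1.0.
        weights.append(float(row[2]) if len(row) == 3 else 1.0)
    return scores, labels, weights


rows = [(0.9, True), (0.4, False), (0.7, True, 2.0)]
y_score, y_true, sample_weight = split_columns(rows)
```

The three resulting lists line up index by index, which is the layout expected when df is None.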

class CurveType(value)

Curve types:

  • PROBSCORE: 1 - False Positive / True Positive

  • ERRPREC: error / recall

  • RECPREC: precision / recall

  • ROC: False Positive / True Positive

  • SKROC: False Positive / True Positive (scikit-learn)
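As an illustration of the ROC type (false positives on one axis, true positives on the other), here is a minimal sketch in plain Python. It is not the library's implementation: `roc_points` is a hypothetical name, labels set to True are treated as positives, and both axes are expressed as rates, which is an assumption about the scaling.

```python
def roc_points(scores, labels):
    """Return [(false_positive_rate, true_positive_rate), ...], one per distinct score."""
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    # Visit observations from the highest score to the lowest.
    ordered = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ordered):
        if label:
            tp += 1
        else:
            fp += 1
        # Emit a point only once every observation tied on this score is counted.
        if i + 1 == len(ordered) or ordered[i + 1][0] != score:
            points.append((fp / n_neg, tp / n_pos))
    return points


curve = roc_points([0.9, 0.8, 0.7, 0.6], [True, False, True, False])
```

Each point corresponds to accepting every observation whose score is at least the current threshold.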

property Data

Returns the underlying dataframe.


Computes the area under the curve (AUC).


cloud – data to use, or None. The function assumes the data is sorted.



The first column is the label, the second one is the score, the third one is the weight.
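One common way to compute an AUC from rows laid out as (label, score, weight) is to measure the weighted share of (positive, negative) pairs in which the positive has the higher score. The sketch below follows that definition; it is an assumption that it matches what the method does internally, `auc_pairwise` is a hypothetical name, and counting ties as half a win is a convention chosen here.

```python
def auc_pairwise(cloud):
    """Weighted AUC from rows of (label, score, weight)."""
    pos = [(s, w) for y, s, w in cloud if y]
    neg = [(s, w) for y, s, w in cloud if not y]
    total = sum(w for _, w in pos) * sum(w for _, w in neg)
    if total == 0:
        raise ValueError("need positive and negative rows with non-zero weight")
    wins = 0.0
    for sp, wp in pos:
        for sn, wn in neg:
            if sp > sn:
                wins += wp * wn
            elif sp == sn:
                # Convention chosen here: a tie counts as half a win.
                wins += 0.5 * wp * wn
    return wins / total


cloud = [(True, 0.9, 1.0), (False, 0.8, 1.0), (True, 0.7, 1.0), (False, 0.6, 1.0)]
area = auc_pairwise(cloud)
```

This quadratic loop is only meant to make the definition explicit; a sorted pass gives the same value in fewer operations.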

auc_interval(bootstrap=10, alpha=0.95)

Determines a confidence interval for the AUC with bootstrap.

  • bootstrap – number of random estimations

  • alpha – defines the confidence interval

Returns a dictionary of values.
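A bootstrap interval for the AUC can be sketched as follows: resample the observations with replacement, recompute the AUC each time, and read the interval off the sorted estimates. This is a generic percentile bootstrap, not necessarily the method used by auc_interval. The function names, the `seed` parameter, the dictionary keys, and the reading of alpha as a confidence level are all assumptions.

```python
import random


def pairwise_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        return None  # undefined when one class is missing
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_interval(labels, scores, bootstrap=10, alpha=0.95, seed=0):
    """Percentile bootstrap interval; alpha is read as the confidence level."""
    point = pairwise_auc(labels, scores)
    if point is None:
        raise ValueError("need at least one positive and one negative label")
    rng = random.Random(seed)
    n = len(labels)
    estimates = []
    while len(estimates) < bootstrap:
        # Draw n observations with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        value = pairwise_auc([labels[i] for i in idx], [scores[i] for i in idx])
        if value is not None:  # skip resamples that lost a whole class
            estimates.append(value)
    estimates.sort()
    # Symmetric percentile cut on both tails.
    cut = int((1.0 - alpha) / 2.0 * (len(estimates) - 1))
    return {
        "auc": point,
        "lower": estimates[cut],
        "upper": estimates[len(estimates) - 1 - cut],
        "mean": sum(estimates) / len(estimates),
    }


labels = [True, False, True, False, True, False, True, False]
scores = [0.9, 0.8, 0.7, 0.6, 0.85, 0.3, 0.65, 0.2]
interval = bootstrap_auc_interval(labels, scores, bootstrap=50, alpha=0.95, seed=1)
```

With only a handful of estimations the bounds are coarse; a larger bootstrap value gives a smoother interval at a higher cost.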

compute_roc_curve(nb=100, curve=CurveType.ROC, bootstrap=False)

Computes a ROC curve with nb points. If nb == -1, the curve has as many points as the data contains. If bootstrap == True, the curve is built from a random resample of the data, which is the basis for confidence intervals based on the bootstrap method.

  • nb – number of points for the curve

  • curve – see CurveType

  • bootstrap – builds the curve after resampling

Returns a DataFrame (metrics and threshold).

If curve is SKROC, the parameter nb is not taken into account. It should be set to 0.

confusion(score=None, nb=10, curve=CurveType.ROC, bootstrap=False)

Computes the confusion matrix for a specific score or all if score is None.

  • score – score or None

  • nb – number of scores (if score is None)

  • curve – see CurveType

  • bootstrap – builds the curve after resampling

Returns one row if score is specified, many rows if score is None.
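For a single score, the confusion matrix reduces to four counts. The sketch below shows one way to obtain them in plain Python; `confusion_at` is a hypothetical name, the rule "predicted positive when the score is at least the threshold" is an assumption, and the returned keys are illustrative rather than the columns produced by the method.

```python
def confusion_at(scores, labels, threshold):
    """Count TP, FP, FN, TN when predicting positive for score >= threshold."""
    counts = {"threshold": threshold, "TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for score, label in zip(scores, labels):
        predicted = score >= threshold  # assumed decision rule
        if predicted and label:
            counts["TP"] += 1
        elif predicted and not label:
            counts["FP"] += 1
        elif label:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


row = confusion_at([0.9, 0.8, 0.7, 0.6], [True, False, True, False], 0.75)
```

Calling the same function over a grid of thresholds gives one row per threshold, which corresponds to the case where score is None.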

plot(nb=100, curve=CurveType.ROC, bootstrap=0, ax=None, thresholds=False, **kwargs)

Plots a ROC curve.

  • nb – number of points

  • curve – see CurveType

  • bootstrap – number of curves for the bootstrap (0 for none)

  • ax – axis

  • thresholds – use thresholds for the X axis

  • kwargs – sent to pandas.plot

Returns ax.


Computes the precision.


Resamples among the data.

Returns a DataFrame.

roc_intersect(roc, x)

The ROC curve is defined by a set of points. This function interpolates those points to determine y for any x.

  • roc – ROC curve

  • x – x

Returns y.
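The interpolation can be sketched with a piecewise linear rule between consecutive points. This is an assumption about the interpolation scheme, since the description does not say which one is used; `interpolate_curve` is a hypothetical name and the curve is given here as a plain list of (x, y) pairs.

```python
def interpolate_curve(points, x):
    """Linearly interpolate y at x on a curve given as (x, y) points."""
    pts = sorted(points)
    # Outside the known range, fall back to the closest end point.
    if x <= pts[0][0]:
        return pts[0][1]
    if x >= pts[-1][0]:
        return pts[-1][1]
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= x <= x1:
            if x1 == x0:
                # Vertical segment: keep the upper value.
                return max(y0, y1)
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return pts[-1][1]


curve = [(0.0, 0.0), (0.2, 0.6), (1.0, 1.0)]
y_low = interpolate_curve(curve, 0.1)
y_high = interpolate_curve(curve, 0.6)
```

Here the query at x = 0.1 falls on the first segment and the query at x = 0.6 on the second one.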

roc_intersect_interval(x, nb, curve=CurveType.ROC, bootstrap=10, alpha=0.05)

Computes a confidence interval for the value returned by roc_intersect.

  • x – x

  • nb – number of curves to draw

  • curve – see CurveType

  • bootstrap – number of random estimations

  • alpha – confidence interval

Returns a dictionary.