NumPy and contingency tables

A classic exercise: compute the $\chi^2$ statistic of a contingency table without writing an explicit loop, and let NumPy do the iteration. Repeat it until you no longer need to.

[46]:
import numpy as np

A = np.array([[0, 1, 2, 3], [4, 5, 6, 7]], dtype=float)
A
[46]:
array([[0., 1., 2., 3.],
       [4., 5., 6., 7.]])
[48]:
# Row totals, kept as a column of shape (2, 1)
A.sum(axis=1, keepdims=True)
[48]:
array([[ 6.],
       [22.]])
[49]:
# The (2, 1) column of row totals is broadcast across the 4 columns of A
A + A.sum(axis=1, keepdims=True)
[49]:
array([[ 6.,  7.,  8.,  9.],
       [26., 27., 28., 29.]])
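The addition above works because `keepdims` leaves a size-1 axis that NumPy can stretch. The following is a minimal sketch of that rule, not part of the original notebook; the names `row_sums`, `flat_sums`, `broadcast_ok` and `broadcast_fails` are introduced here for illustration, and `np.broadcast_shapes` assumes NumPy 1.20 or later.

```python
import numpy as np

A = np.array([[0, 1, 2, 3], [4, 5, 6, 7]], dtype=float)

row_sums = A.sum(axis=1, keepdims=True)  # shape (2, 1)
flat_sums = A.sum(axis=1)                # shape (2,)

# (2, 4) and (2, 1) are compatible: the size-1 axis is stretched to 4.
broadcast_ok = np.broadcast_shapes(A.shape, row_sums.shape)

# (2, 4) and (2,) are not: the trailing axes (4 and 2) differ.
try:
    A + flat_sums
    broadcast_fails = False
except ValueError:
    broadcast_fails = True
```

Without `keepdims`, the row totals would have to be reshaped by hand, for example with `flat_sums[:, None]`.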
[45]:
# Column totals, kept as a row of shape (1, 4)
A.sum(axis=0, keepdims=True)
[45]:
array([[ 4.,  6.,  8., 10.]])
[54]:
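# Reference version with explicit loops
B = np.zeros(A.shape, dtype=A.dtype)
N2 = A.sum() ** 2   # square of the grand total
L = A.sum(axis=1)   # row totals, shape (2,)
C = A.sum(axis=0)   # column totals, shape (4,)
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        # cell value minus (row total * column total) / N2
        B[i, j] = A[i, j] - L[i] * C[j] / N2
B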
[54]:
array([[-0.03061224,  0.95408163,  1.93877551,  2.92346939],
       [ 3.8877551 ,  4.83163265,  5.7755102 ,  6.71938776]])
[56]:
# Same computation as the double loop, written with broadcasting
A - A.sum(axis=1, keepdims=True) * A.sum(axis=0, keepdims=True) / A.sum() ** 2
[56]:
array([[-0.03061224,  0.95408163,  1.93877551,  2.92346939],
       [ 3.8877551 ,  4.83163265,  5.7755102 ,  6.71938776]])
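The loop and the broadcast expression above return the same array. As a sketch of how the same idea extends to the full statistic, the block below assumes the usual definition $\chi^2 = \sum_{ij} (A_{ij} - E_{ij})^2 / E_{ij}$ with expected counts $E_{ij} = L_i C_j / N$, that is, divided by the grand total $N$ rather than by $N^2$ as in the cells above. The function names `chi2_statistic` and `chi2_statistic_loops` are hypothetical, and the code assumes every expected count is non-zero.

```python
import numpy as np


def chi2_statistic(table):
    """Chi-square statistic of a contingency table, without explicit loops."""
    table = np.asarray(table, dtype=float)
    n = table.sum()                           # grand total N
    rows = table.sum(axis=1, keepdims=True)   # row totals, shape (r, 1)
    cols = table.sum(axis=0, keepdims=True)   # column totals, shape (1, c)
    expected = rows * cols / n                # broadcast to shape (r, c)
    return ((table - expected) ** 2 / expected).sum()


def chi2_statistic_loops(table):
    """Same statistic with explicit loops, used only as a cross-check."""
    table = np.asarray(table, dtype=float)
    n = table.sum()
    rows = table.sum(axis=1)
    cols = table.sum(axis=0)
    total = 0.0
    for i in range(table.shape[0]):
        for j in range(table.shape[1]):
            e = rows[i] * cols[j] / n
            total += (table[i, j] - e) ** 2 / e
    return total


A = np.array([[0, 1, 2, 3], [4, 5, 6, 7]], dtype=float)
chi2_vec = chi2_statistic(A)
chi2_loop = chi2_statistic_loops(A)
```

Comparing the two values with `np.isclose(chi2_vec, chi2_loop)` is a quick way to check the vectorized version against the loop.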
[57]:
L = A.sum(axis=1, keepdims=True)  # row totals as a column
C = A.sum(axis=0, keepdims=True)  # column totals as a row
L.shape, C.shape
[57]:
((2, 1), (1, 4))
[58]:
L * C
[58]:
array([[ 24.,  36.,  48.,  60.],
       [ 88., 132., 176., 220.]])
[59]:
C * L
[59]:
array([[ 24.,  36.,  48.,  60.],
       [ 88., 132., 176., 220.]])
[61]:
L @ C
[61]:
array([[ 24.,  36.,  48.,  60.],
       [ 88., 132., 176., 220.]])
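The three products above agree because `L` is a column of shape (2, 1) and `C` is a row of shape (1, 4): both the broadcast product and the matrix product then reduce to an outer product. The sketch below checks this against `np.outer` and shows one difference between the two operators; the names `outer`, `same` and `matmul_fails` are introduced here for illustration only.

```python
import numpy as np

A = np.array([[0, 1, 2, 3], [4, 5, 6, 7]], dtype=float)
L = A.sum(axis=1, keepdims=True)  # shape (2, 1)
C = A.sum(axis=0, keepdims=True)  # shape (1, 4)

# np.outer flattens its inputs and returns an array of shape (2, 4).
outer = np.outer(L, C)
same = np.array_equal(L * C, outer) and np.array_equal(L @ C, outer)

# Broadcasting is symmetric (L * C == C * L), but the matrix product is not:
# C @ L would multiply (1, 4) by (2, 1), whose inner dimensions do not match.
try:
    C @ L
    matmul_fails = False
except ValueError:
    matmul_fails = True
```

For this exercise `L * C` is the natural choice, since it keeps working when further element-wise operations are chained onto it.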
