pandas.core.groupby.DataFrameGroupBy.corr

DataFrameGroupBy.corr(method='pearson', min_periods=1, numeric_only=False)

Compute pairwise correlation of columns, excluding NA/null values.

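Parameters:

method : {'pearson', 'kendall', 'spearman'} or callable

Method of correlation:

pearson : standard correlation coefficient
kendall : Kendall Tau correlation coefficient
spearman : Spearman rank correlation
callable : callable that takes two 1d ndarrays as input and returns a float. Note that the matrix returned by corr will have 1 along the diagonal and will be symmetric regardless of the callable's behavior.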
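As a rough illustration of the method parameter on a grouped frame, the sketch below compares the built-in 'pearson' option with a user-supplied callable. The column names (group, x, y) and the helper abs_pearson are hypothetical and exist only for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two groups with opposite linear relationships.
df = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b", "b"],
    "x": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
    "y": [2.0, 4.0, 6.0, 3.0, 2.0, 1.0],
})

# Built-in method: one correlation matrix per group.
pearson = df.groupby("group").corr(method="pearson")

def abs_pearson(a, b):
    """Hypothetical callable: takes two 1d ndarrays, returns a float."""
    return abs(np.corrcoef(a, b)[0, 1])

# The callable is applied to each pair of columns within each group.
custom = df.groupby("group").corr(method=abs_pearson)

print(pearson)
print(custom)
```

In this data, group "a" is perfectly positively correlated and group "b" perfectly negatively correlated, so the callable version reports the same magnitude for both groups.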

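min_periods : int, optional

Minimum number of observations required per pair of columns to have a valid result. Currently only available for Pearson and Spearman correlation.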
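A minimal sketch of how min_periods plays out per group, using made-up data: group "a" has only two rows where both x and y are present, so with min_periods=3 its off-diagonal entries come back as NaN, while group "b" has four complete rows and yields a valid value.

```python
import numpy as np
import pandas as pd

# Hypothetical data: group "a" has gaps, group "b" is complete.
df = pd.DataFrame({
    "group": ["a"] * 4 + ["b"] * 4,
    "x": [1.0, 2.0, np.nan, 4.0, 1.0, 2.0, 3.0, 4.0],
    "y": [1.0, np.nan, 3.0, 4.0, 2.0, 4.0, 6.0, 8.0],
})

# Require at least 3 jointly observed rows per pair of columns.
result = df.groupby("group").corr(min_periods=3)
print(result)
```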

numeric_only : bool, default False

Include only float, int or boolean data.

New in version 1.5.0.

Changed in version 2.0.0: The default value of numeric_only is now False.
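The following sketch shows one way numeric_only might be used when a grouped frame also carries a non-numeric column. The column names are hypothetical, and the example assumes pandas 1.5.0 or later, where the parameter exists. With numeric_only=True the string column is left out of the result; with the default of False, such a column is expected to cause an error rather than be silently dropped.

```python
import pandas as pd

# Hypothetical data with a non-numeric "label" column.
df = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b", "b"],
    "label": ["p", "q", "r", "s", "t", "u"],
    "x": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
    "y": [2.0, 4.0, 6.0, 3.0, 2.0, 1.0],
})

# Only the numeric columns take part in the correlation.
result = df.groupby("group").corr(numeric_only=True)
print(result)
```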

Returns:

DataFrame: Correlation matrix.
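For a grouped frame, the returned DataFrame stacks one correlation matrix per group. The sketch below, with hypothetical column names, assumes the default groupby settings, under which the result is indexed by the group key on the outer level and the column name on the inner level. It shows two ways of slicing that result.

```python
import pandas as pd

# Hypothetical data: two groups of three rows each.
df = pd.DataFrame({
    "group": ["a", "a", "a", "b", "b", "b"],
    "x": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
    "y": [1.0, 3.0, 2.0, 3.0, 1.0, 2.0],
})

result = df.groupby("group").corr()

# Correlation matrix for a single group (2 x 2, indexed by column name).
matrix_a = result.loc["a"]

# Correlation between x and y for every group (Series indexed by group).
per_group_xy = result.xs("x", level=1)["y"]

print(matrix_a)
print(per_group_xy)
```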

See also

DataFrame.corrwith: Compute pairwise correlation with another DataFrame or Series.
Series.corr: Compute the correlation between two Series.

Notes

Pearson, Kendall and Spearman correlation are currently computed using pairwise complete observations.
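To make "pairwise complete observations" concrete, the sketch below recomputes two Pearson entries by hand for one group: for each pair of columns, only the rows where both columns are non-null are used, so different pairs may draw on different rows. The data and the helper pairwise_complete_corr are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data: y and z are missing in different rows.
df = pd.DataFrame({
    "group": ["a"] * 5 + ["b"] * 5,
    "x": [1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    "y": [2.0, np.nan, 5.0, 9.0, 10.0, 5.0, 4.0, np.nan, 2.0, 2.0],
    "z": [5.0, 3.0, np.nan, 2.0, 1.0, 1.0, np.nan, 2.0, 4.0, 3.0],
})

grouped = df.groupby("group").corr()

def pairwise_complete_corr(frame, first, second):
    """Hypothetical helper: Pearson r on rows where both columns are present."""
    complete = frame[[first, second]].dropna()
    return np.corrcoef(complete[first], complete[second])[0, 1]

group_a = df[df["group"] == "a"]
manual_xy = pairwise_complete_corr(group_a, "x", "y")  # drops the row where y is NaN
manual_xz = pairwise_complete_corr(group_a, "x", "z")  # drops the row where z is NaN

print(grouped.loc[("a", "x"), "y"], manual_xy)
print(grouped.loc[("a", "x"), "z"], manual_xz)
```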

Examples

>>> import numpy as np
>>> import pandas as pd
>>> def histogram_intersection(a, b):
...     v = np.minimum(a, b).sum().round(decimals=1)
...     return v
>>> df = pd.DataFrame([(.2, .3), (.0, .6), (.6, .0), (.2, .1)],
...                   columns=['dogs', 'cats'])
>>> df.corr(method=histogram_intersection)
      dogs  cats
dogs   1.0   0.3
cats   0.3   1.0

>>> df = pd.DataFrame([(1, 1), (2, np.nan), (np.nan, 3), (4, 4)],
...                   columns=['dogs', 'cats'])
>>> df.corr(min_periods=3)
      dogs  cats
dogs   1.0   NaN
cats   NaN   1.0