python – Replacing values with groupby means

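I have a DataFrame with a column that contains some bad data in the form of various negative values. I would like to replace each value < 0 with the mean of the group it belongs to. For missing values stored as NA, I would do something like this: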

data = df.groupby(['GroupID']).column
data.transform(lambda x: x.fillna(x.mean()))
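
For reference, here is a self-contained sketch of the NA case above. The names `GroupID` and `column` follow the snippet, but the data is invented purely for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical data: one missing value in group 1.
df = pd.DataFrame({
    "GroupID": [1, 1, 1, 2, 2],
    "column": [1.0, np.nan, 3.0, 4.0, 6.0],
})

# Fill each NaN with the mean of its own group; mean() skips NaN by default.
filled = df.groupby("GroupID")["column"].transform(lambda x: x.fillna(x.mean()))
print(filled.tolist())
# [1.0, 2.0, 3.0, 4.0, 6.0]
```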

But how can I do this for a condition such as x < 0?

Using @AndyHayden's example, you could use groupby/transform with a replace function:

import pandas as pd

df = pd.DataFrame([[1,1],[1,-1],[2,1],[2,2]], columns=list('ab'))
print(df)
#    a  b
# 0  1  1
# 1  1 -1
# 2  2  1
# 3  2  2

data = df.groupby(['a'])
def replace(group):
    # Work on a copy so the object pandas passes in is not mutated.
    group = group.copy()
    mask = group < 0
    # Select those values where it is < 0, and replace
    # them with the mean of the values which are not < 0.
    group[mask] = group[~mask].mean()
    return group
print(data.transform(replace))
#    b
# 0  1
# 1  1
# 2  1
# 3  2
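
A possible alternative, sketched here on the same example frame, avoids a custom function altogether: turn the negative values into NaN first, then reuse the fillna idea from the question. This is only one way to do it, not part of the original answer, and the variable names `valid`, `group_mean` and `result` are chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 1], [1, -1], [2, 1], [2, 2]], columns=list("ab"))

# Turn negative entries into NaN so the group mean ignores them.
valid = df["b"].where(df["b"] >= 0)

# For every row, the mean of the non-negative values in the same group.
group_mean = valid.groupby(df["a"]).transform("mean")

# Keep the valid values and fall back to the group mean elsewhere.
result = valid.fillna(group_mean)
print(result.tolist())
# [1.0, 1.0, 1.0, 2.0]
```

Note that the result is a float column, because NaN is used as an intermediate marker.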
Translated from: https://stackoverflow.com/questions/14760757/replacing-values-with-groupby-means
