Python collections.Counter Tutorial - Counting Element Frequencies Efficiently | Python Programming Guide

RuanTongKan
Python
2025-08-03

Python collections.Counter Tutorial

The complete guide to counting element occurrences efficiently

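What is Counter?

collections.Counter is a counter class in Python's standard library for tallying how often hashable objects occur. It is a subclass of dict that maps each element to its count, which makes counting quick and concise.

Main advantages of Counter:

Handles missing keys automatically (returns 0 instead of raising KeyError)
Provides dedicated counting methods (such as most_common)
Supports multiset operations (union, intersection, and so on)
More concise than counting by hand with a dict, and generally no slower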
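
For comparison, here is a minimal sketch of the manual dict approach next to Counter. The sample list and variable names are illustrative only and are not part of the original examples.

```python
from collections import Counter

fruits = ['apple', 'banana', 'apple', 'orange', 'banana', 'apple']  # illustrative data

# Manual counting with a plain dict: the missing-key case is handled by hand
manual_counts = {}
for fruit in fruits:
    manual_counts[fruit] = manual_counts.get(fruit, 0) + 1

# Counter produces the same tallies in a single call
counter_counts = Counter(fruits)

print(dict(counter_counts) == manual_counts)  # True
```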

Creating a Counter

There are several ways to create a Counter:

1. From a sequence

from collections import Counter

# Create from a list
words = ['apple', 'banana', 'apple', 'orange', 'banana', 'apple']
word_count = Counter(words)
print(word_count)  # Counter({'apple': 3, 'banana': 2, 'orange': 1})

2. From a dictionary

# Create from a dict that already maps elements to counts
count_data = {'red': 5, 'blue': 12, 'green': 7}
color_counter = Counter(count_data)
print(color_counter)  # Counter({'blue': 12, 'green': 7, 'red': 5})

3. From keyword arguments

# Create from keyword arguments
c = Counter(cats=4, dogs=8)
print(c)  # Counter({'dogs': 8, 'cats': 4})
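
A Counter can also start out empty and be filled one item at a time, which suits data that arrives as a stream. The sketch below assumes a made-up list of HTTP status codes as its input.

```python
from collections import Counter

# Start empty and count events as they arrive (the status codes are made-up sample data)
status_counts = Counter()
for status in [200, 404, 200, 500, 200]:
    status_counts[status] += 1  # missing keys start from 0, so no initialization is needed

print(status_counts[200])  # 3
print(status_counts[404])  # 1
```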

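Updating a Counter

Use the update() method to add new data. Counts are added to the existing ones rather than replaced:

from collections import Counter

# Initialize the Counter
inventory = Counter(apple=5, orange=3, banana=2)

# Add new data
new_stock = ['apple', 'apple', 'banana', 'kiwi', 'kiwi']
inventory.update(new_stock)

print(inventory)
# Counter({'apple': 7, 'orange': 3, 'banana': 3, 'kiwi': 2})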
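
update() also accepts a mapping, and here it differs from dict.update(): the counts are summed instead of overwritten. A small sketch with illustrative names:

```python
from collections import Counter

stock = Counter(apple=5, orange=3)

# With a mapping argument, Counter.update() adds to the existing counts
stock.update({'apple': 2, 'kiwi': 4})
print(stock['apple'])  # 7

# A plain dict's update() replaces the stored value instead
plain = {'apple': 5, 'orange': 3}
plain.update({'apple': 2, 'kiwi': 4})
print(plain['apple'])  # 2
```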

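Accessing counts

Look up an element's count by indexing the Counter:

from collections import Counter

c = Counter(['a', 'b', 'c', 'a', 'b', 'a'])

# Look up an element that exists
print(c['a'])  # 3

# Look up an element that does not exist (no KeyError is raised)
print(c['d'])  # 0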
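
One detail worth knowing: reading a missing key returns 0 but does not store the key, and setting a count to 0 does not remove it. The sketch below uses illustrative data to show the difference.

```python
from collections import Counter

c = Counter(['a', 'b', 'a'])

print(c['z'])     # 0
print('z' in c)   # False: reading a missing key does not add it

c['b'] = 0
print('b' in c)   # True: a zero count still keeps the key

del c['b']        # use del to remove an element entirely
print('b' in c)   # False
print(sorted(c))  # ['a']
```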

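Common methods

most_common([n])

Returns the n most common elements and their counts, ordered from most to least frequent. If n is omitted, all elements are returned. Elements with equal counts keep the order in which they were first encountered.

from collections import Counter

letters = Counter('abracadabra')
print(letters.most_common(3))
# [('a', 5), ('b', 2), ('r', 2)]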
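
Two related idioms, sketched with the same string: calling most_common() without an argument, and slicing its result from the end to get the least common elements.

```python
from collections import Counter

letters = Counter('abracadabra')

# Without an argument, every element is returned, most frequent first
print(letters.most_common())
# [('a', 5), ('b', 2), ('r', 2), ('c', 1), ('d', 1)]

# Slicing the full list from the end gives the n least common elements
n = 2
print(letters.most_common()[:-n - 1:-1])
# [('d', 1), ('c', 1)]

# Total number of items counted
print(sum(letters.values()))  # 11
```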

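elements()

Returns an iterator over the elements, repeating each one as many times as its count. Elements with a count below one are skipped.

c = Counter(a=3, b=2, c=1)
print(list(c.elements()))
# ['a', 'a', 'a', 'b', 'b', 'c']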
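
The sketch below, using made-up inputs, shows two edge cases of elements(): zero and negative counts are left out, and the items come back grouped by element rather than in their original order.

```python
from collections import Counter

# Counts below one are ignored by elements()
c = Counter(a=2, b=0, c=-1, d=1)
print(list(c.elements()))  # ['a', 'a', 'd']

# Items are grouped by element in first-seen order, so the input order is lost
letters = Counter('banana')
print(''.join(letters.elements()))  # baaann
```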

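subtract()

Subtracts the counts of another iterable or mapping in place. The result may contain zero and negative counts.

c = Counter(a=4, b=2, c=0, d=-2)
d = Counter(a=1, b=2, c=3, d=4)
c.subtract(d)
print(c)  # Counter({'a': 3, 'b': 0, 'c': -3, 'd': -6})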
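
To make the contrast concrete, here is a short sketch comparing subtract() with the - operator, plus the unary + operator that strips zero and negative counts. The variable names are illustrative.

```python
from collections import Counter

c = Counter(a=4, b=2)
d = Counter(a=1, b=5)

# subtract() works in place and keeps zero and negative counts
in_place = Counter(c)  # copy, so c itself stays unchanged
in_place.subtract(d)
print(in_place)   # Counter({'a': 3, 'b': -3})

# The - operator returns a new Counter and drops anything that is not positive
print(c - d)      # Counter({'a': 3})

# Unary plus removes zero and negative counts from an existing Counter
print(+in_place)  # Counter({'a': 3})
```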

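Mathematical operations

Counter supports multiset arithmetic and set-style operators. Each one returns a new Counter that contains only positive counts:

from collections import Counter

c1 = Counter(a=3, b=1)
c2 = Counter(a=1, b=2)

# Addition
print(c1 + c2)  # Counter({'a': 4, 'b': 3})

# Subtraction (keeps only positive counts)
print(c1 - c2)  # Counter({'a': 2})

# Intersection (minimum of the counts)
print(c1 & c2)  # Counter({'a': 1, 'b': 1})

# Union (maximum of the counts)
print(c1 | c2)  # Counter({'a': 3, 'b': 2})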
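
As a sketch of how these operators help compare data sets, the example below checks two words for being anagrams and works out which parts are still missing from an order. The part names and quantities are invented for illustration.

```python
from collections import Counter

# Two words are anagrams when their letter counts match
print(Counter('listen') == Counter('silent'))  # True

# What is still needed to fill an order, given what is on hand?
needed = Counter(bolt=4, nut=4, washer=2)
on_hand = Counter(bolt=6, nut=1)
shortfall = needed - on_hand  # surplus bolts drop out because the result is not positive
print(shortfall)  # Counter({'nut': 3, 'washer': 2})
```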

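Practical examples

1. Text analysis: counting word frequencies

from collections import Counter

text = "Python is a widely used programming language, Python is known for its concise syntax and powerful features"
words = text.split()  # splits on whitespace, so the text must be space-delimited

# Count word frequencies (case-insensitive, with surrounding '.' and ',' stripped)
word_count = Counter(word.lower().strip('.,') for word in words)

# Get the 3 most common words
print(word_count.most_common(3))
# [('python', 2), ('is', 2), ('a', 1)]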
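
When the text carries more varied punctuation, a regular expression is one possible way to tokenize before counting. This is only a sketch: the sample sentence and the pattern (lowercase letters and apostrophes) are assumptions, and real text may need a different tokenizer.

```python
import re
from collections import Counter

sample = "The quick brown fox. The lazy dog! The end?"  # made-up sample text

# Lowercase first, then pull out runs of letters and apostrophes as words
tokens = re.findall(r"[a-z']+", sample.lower())
word_freq = Counter(tokens)

print(word_freq.most_common(1))  # [('the', 3)]
```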

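2. Data validation: finding duplicates

from collections import Counter

def find_duplicates(items):
    # Return every item that occurs more than once, in first-seen order
    counter = Counter(items)
    return [item for item, count in counter.items() if count > 1]

data = [1, 2, 3, 4, 2, 5, 3, 6, 3]
print(find_duplicates(data))  # [2, 3]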

3. Inventory management

from collections import Counter

# Initial stock
inventory = Counter(apples=15, oranges=20, bananas=10)

# Sales records
sales = ['apples', 'apples', 'oranges', 'bananas', 'apples']

# Update the stock
inventory.subtract(sales)

# Print the current stock
print("Current stock:")
for item, count in inventory.items():
    if count > 0:
        print(f"{item}: {count}")

# Output:
# Current stock:
# apples: 12
# oranges: 19
# bananas: 9
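
Because subtract() keeps negative counts, it can also flag items that were sold beyond the available stock. A minimal sketch with invented quantities:

```python
from collections import Counter

stock = Counter(apples=2, oranges=5)
orders = ['apples', 'apples', 'apples', 'oranges']

stock.subtract(orders)

# Negative counts mark items that were sold beyond the available stock
oversold = {item: -count for item, count in stock.items() if count < 0}
print(oversold)  # {'apples': 1}
```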

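Counter tips

Use Counter instead of manual dict counting for more concise code
Combine it with most_common() to get rankings quickly
Use the arithmetic operators to compare data sets
Looking up a missing key returns 0 instead of raising KeyError
Use elements() to expand the counts back into individual items (grouped by element, so the original order is not preserved)
A Counter can be serialized to JSON like a dict, provided its keys are valid JSON object keys such as strings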
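
A JSON round trip might look like the sketch below. Note that json.loads() returns a plain dict, so the result is wrapped in Counter again; the sample string is illustrative.

```python
import json
from collections import Counter

counts = Counter('hello')

payload = json.dumps(counts)              # serialized as a plain JSON object
restored = Counter(json.loads(payload))   # json.loads returns a dict, so wrap it again

print(restored == counts)  # True
print(restored['l'])       # 2
```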

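Summary

collections.Counter is a powerful counting tool in Python, well suited to any task that tallies how often elements occur. Compared with a hand-written counter, it offers more concise syntax and a richer set of operations.

Key points:

A Counter can be created from iterables, mappings, or keyword arguments
Counts can be updated dynamically
Practical methods such as most_common() are built in
Multiset operations are supported
Useful for text processing, data analysis, inventory management, and similar tasks

Knowing how to use Counter makes counting code shorter and easier to read.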

This article was published by RuanTongKan on 2025-08-03 at 吾爱品聚. If you have any questions, please contact us.
Article link: https://521pj.cn/20257202.html
