pyspark.pandas.groupby.GroupBy.size
GroupBy.size() → pyspark.pandas.series.Series

Compute group sizes, that is, the number of rows in each group.

Examples
>>> import pyspark.pandas as ps
>>> df = ps.DataFrame({'A': [1, 2, 2, 3, 3, 3],
...                    'B': [1, 1, 2, 3, 3, 3]},
...                   columns=['A', 'B'])
>>> df
   A  B
0  1  1
1  2  1
2  2  2
3  3  3
4  3  3
5  3  3

>>> df.groupby('A').size().sort_index()
A
1    1
2    2
3    3
dtype: int64

>>> df.groupby(['A', 'B']).size().sort_index()
A  B
1  1    1
2  1    1
   2    1
3  3    3
dtype: int64

For Series:

>>> df.B.groupby(df.A).size().sort_index()
A
1    1
2    2
3    3
Name: B, dtype: int64

>>> df.groupby(df.A).B.size().sort_index()
A
1    1
2    2
3    3
Name: B, dtype: int64
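
To make "group size" concrete, the following is a plain-Python sketch of the counting that the examples above illustrate. It is not pandas-on-Spark code and says nothing about how the library computes the result internally; the helper name `group_sizes` and the `rows` list are hypothetical, introduced only for illustration. The rows mirror the `A` and `B` columns of the example DataFrame.

```python
from collections import Counter

# Same data as the example DataFrame, one (A, B) tuple per row.
rows = [(1, 1), (2, 1), (2, 2), (3, 3), (3, 3), (3, 3)]


def group_sizes(rows, key):
    """Count how many rows fall under each distinct key value.

    Hypothetical helper for illustration only; sorted by key to
    mimic the effect of calling sort_index() on the result.
    """
    counts = Counter(key(row) for row in rows)
    return dict(sorted(counts.items()))


# Group by column A only.
sizes_by_a = group_sizes(rows, key=lambda row: row[0])

# Group by the pair (A, B).
sizes_by_a_b = group_sizes(rows, key=lambda row: row)

print(sizes_by_a)    # {1: 1, 2: 2, 3: 3}
print(sizes_by_a_b)  # {(1, 1): 1, (2, 1): 1, (2, 2): 1, (3, 3): 3}
```

The printed counts match the values shown for `df.groupby('A').size()` and `df.groupby(['A', 'B']).size()` above: each group's size is simply the number of rows that share its key.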