pyspark.sql.Column.isin
- Column.isin(*cols)
A boolean expression that is evaluated to true if the value of this expression is contained by the evaluated values of the arguments.
New in version 1.5.0.
Changed in version 3.4.0: Supports Spark Connect.
- Parameters
- cols : Any
The values to compare against the column values, passed either as separate arguments or as a single list. The result is true for a row only if the column value in that row matches at least one of them.
- Returns
Column
Column of booleans showing whether each element in the Column is contained in cols.
Examples
>>> df = spark.createDataFrame([(2, "Alice"), (5, "Bob"), (8, "Mike")], ["age", "name"])
Example 1: Filter rows with names in the specified values
>>> df[df.name.isin("Bob", "Mike")].show()
+---+----+
|age|name|
+---+----+
|  5| Bob|
|  8|Mike|
+---+----+
Example 2: Filter rows with ages in the specified list
>>> df[df.age.isin([1, 2, 3])].show()
+---+-----+
|age| name|
+---+-----+
|  2|Alice|
+---+-----+
Example 3: Filter rows with names not in the specified values
>>> df[~df.name.isin("Alice", "Bob")].show()
+---+----+
|age|name|
+---+----+
|  8|Mike|
+---+----+
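The examples above call isin in two styles: with separate arguments (Examples 1 and 3) and with a single list (Example 2). The following is a minimal pure-Python sketch of that row-wise behavior, not Spark's implementation. The helper names normalize_isin_args and isin_model are hypothetical, plain lists stand in for a Column, and null handling and Column-valued arguments are deliberately left out.

```python
from typing import Any, List, Tuple


def normalize_isin_args(*cols: Any) -> Tuple[Any, ...]:
    """Hypothetical helper: treat a single list or set argument as the candidate values."""
    if len(cols) == 1 and isinstance(cols[0], (list, set)):
        return tuple(cols[0])
    return cols


def isin_model(column_values: List[Any], *cols: Any) -> List[bool]:
    """Hypothetical model: one boolean per row, true if the row's value is a candidate."""
    candidates = normalize_isin_args(*cols)
    return [value in candidates for value in column_values]


# Plain lists standing in for the "age" and "name" columns of df.
ages = [2, 5, 8]
names = ["Alice", "Bob", "Mike"]

# Separate-arguments style, as in Example 1.
names_mask = isin_model(names, "Bob", "Mike")

# Single-list style, as in Example 2.
ages_mask = isin_model(ages, [1, 2, 3])

# Negation of the mask, mirroring the ~ operator in Example 3.
not_names_mask = [not matched for matched in isin_model(names, "Alice", "Bob")]

print(names_mask)      # [False, True, True]
print(ages_mask)       # [True, False, False]
print(not_names_mask)  # [False, False, True]
```

Each mask lines up with the rows that the corresponding example keeps: Bob and Mike, then Alice, then Mike.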