pyspark.sql.Column.isin

Column.isin(*cols)

A boolean expression that evaluates to true if the value of this expression is contained in the evaluated values of the arguments.

New in version 1.5.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
cols : Any

The values to compare against the column values, passed either as separate arguments or as a single list. The result is true for a row only if the column value in that row matches one of these values.

Returns
Column

Column of booleans showing whether each element in the Column is contained in cols.

Examples

>>> df = spark.createDataFrame([(2, "Alice"), (5, "Bob"), (8, "Mike")], ["age", "name"])

Example 1: Filter rows with names in the specified values

>>> df[df.name.isin("Bob", "Mike")].show()
+---+----+
|age|name|
+---+----+
|  5| Bob|
|  8|Mike|
+---+----+

Example 2: Filter rows with ages in the specified list

>>> df[df.age.isin([1, 2, 3])].show()
+---+-----+
|age| name|
+---+-----+
|  2|Alice|
+---+-----+

Example 3: Filter rows with names not in the specified values

>>> df[~df.name.isin("Alice", "Bob")].show()
+---+----+
|age|name|
+---+----+
|  8|Mike|
+---+----+