pyspark.sql.functions.element_at

pyspark.sql.functions.element_at(col, extraction)

Collection function: (array, index) - Returns the element of the array at the given (1-based) index. If the index is 0, Spark throws an error. If the index is negative, elements are accessed from the last to the first. If ‘spark.sql.ansi.enabled’ is set to true, an index outside the array boundaries raises an exception instead of returning NULL.

(map, key) - Returns the value for the given key if col is a map; the key is passed as extraction. The function always returns NULL if the key is not contained in the map.
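The indexing rules above can be sketched in plain Python. This is only an illustrative model, not Spark's implementation: the helpers `element_at_array` and `element_at_map` and the `ansi_enabled` parameter are hypothetical names introduced here, with `None` standing in for NULL and `ansi_enabled` standing in for the ‘spark.sql.ansi.enabled’ setting.

```python
from typing import Any, Mapping, Optional, Sequence


def element_at_array(
    values: Sequence[Any], index: int, ansi_enabled: bool = False
) -> Optional[Any]:
    """Model of the (array, index) rules; hypothetical helper, not Spark code."""
    if index == 0:
        # Index 0 is always an error because positions start at 1.
        raise ValueError("index 0 is invalid; positions are 1-based")
    if abs(index) > len(values):
        # Out of bounds: an error under ANSI mode, otherwise None (NULL).
        if ansi_enabled:
            raise IndexError(
                f"index {index} is out of bounds for length {len(values)}"
            )
        return None
    # Positive indexes count from the front, negative ones from the back.
    return values[index - 1] if index > 0 else values[index]


def element_at_map(mapping: Mapping[Any, Any], key: Any) -> Optional[Any]:
    """Model of the (map, key) rule: a missing key yields None (NULL)."""
    return mapping.get(key)


data = ["a", "b", "c"]
print(element_at_array(data, 1))    # a
print(element_at_array(data, -1))   # c
print(element_at_array(data, 4))    # None
print(element_at_map({"a": 1.0, "b": 2.0}, "c"))  # None
```

The model treats index 0 as an error regardless of the ANSI setting, while an out-of-bounds index depends on it, which mirrors the distinction drawn in the description.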

New in version 2.4.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
col : Column or str

name of column containing array or map

extraction

index to check for in array or key to check for in map

Returns
Column

value at the given position or key.

See also

get()

Notes

The position is 1-based, not 0-based.

Examples

Example 1: Getting the first element of an array

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([(["a", "b", "c"],)], ['data'])
>>> df.select(sf.element_at(df.data, 1)).show()
+-------------------+
|element_at(data, 1)|
+-------------------+
|                  a|
+-------------------+

Example 2: Getting the last element of an array using negative index

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([(["a", "b", "c"],)], ['data'])
>>> df.select(sf.element_at(df.data, -1)).show()
+--------------------+
|element_at(data, -1)|
+--------------------+
|                   c|
+--------------------+

Example 3: Getting a value from a map using a key

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([({"a": 1.0, "b": 2.0},)], ['data'])
>>> df.select(sf.element_at(df.data, sf.lit("a"))).show()
+-------------------+
|element_at(data, a)|
+-------------------+
|                1.0|
+-------------------+

Example 4: Getting a non-existing value from a map using a key

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([({"a": 1.0, "b": 2.0},)], ['data'])
>>> df.select(sf.element_at(df.data, sf.lit("c"))).show()
+-------------------+
|element_at(data, c)|
+-------------------+
|               NULL|
+-------------------+