pyspark.sql.DataFrame.printSchema

DataFrame.printSchema(level=None)

Prints the schema in tree format. If the schema is nested, you can optionally specify how many levels to print.

New in version 1.3.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
level : int, optional

How many levels to print for nested schemas. If not specified, all levels are printed.

New in version 3.5.0.

Examples

Example 1: Printing the schema of a DataFrame with basic columns

>>> df = spark.createDataFrame(
...     [(14, "Tom"), (23, "Alice"), (16, "Bob")], ["age", "name"])
>>> df.printSchema()
root
 |-- age: long (nullable = true)
 |-- name: string (nullable = true)

Example 2: Printing the schema with a specified level for nested columns

>>> df = spark.createDataFrame([(1, (2, 2))], ["a", "b"])
>>> df.printSchema(1)
root
 |-- a: long (nullable = true)
 |-- b: struct (nullable = true)

Example 3: Printing the schema with a deeper nesting level

>>> df.printSchema(2)
root
 |-- a: long (nullable = true)
 |-- b: struct (nullable = true)
 |    |-- _1: long (nullable = true)
 |    |-- _2: long (nullable = true)
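
The effect of level in Examples 2 and 3 can be mimicked without a Spark session. The following is a minimal pure-Python sketch, not Spark's implementation: the helper name tree_string and the (name, type, nullable, children) tuple layout are assumptions made for illustration only. It reproduces the tree layout shown above and stops descending once the requested number of levels has been printed.

```python
from typing import List, Optional, Tuple

# Hypothetical field description used only for this sketch:
# (column name, type name, nullable flag, list of child fields).
Field = Tuple[str, str, bool, list]


def tree_string(fields: List[Field], level: Optional[int] = None) -> str:
    """Render fields in a printSchema-like tree, limited to `level` levels."""
    lines = ["root"]

    def walk(items: List[Field], prefix: str, depth: int) -> None:
        # Stop descending once the requested number of levels is reached.
        if level is not None and depth > level:
            return
        for name, type_name, nullable, children in items:
            flag = str(nullable).lower()  # "true" / "false", as in the output above
            lines.append(f"{prefix}-- {name}: {type_name} (nullable = {flag})")
            # Nested fields are indented by four spaces plus a vertical bar.
            walk(children, prefix + "    |", depth + 1)

    walk(fields, " |", 1)
    return "\n".join(lines)


# Same shape as the DataFrame in Example 2: a long column and a struct column.
schema = [
    ("a", "long", True, []),
    ("b", "struct", True, [("_1", "long", True, []), ("_2", "long", True, [])]),
]

print(tree_string(schema, level=1))  # struct fields are hidden
print(tree_string(schema, level=2))  # struct fields are shown
```

In this sketch, leaving level as None descends through every nested level, which is meant to mirror calling printSchema() with no argument.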

Example 4: Printing the schema of a DataFrame with nullable and non-nullable columns

>>> df = spark.range(1).selectExpr("id AS nonnullable", "NULL AS nullable")
>>> df.printSchema()
root
 |-- nonnullable: long (nullable = false)
 |-- nullable: void (nullable = true)