pyspark.sql.datasource.DataSourceStreamReader.initialOffset

DataSourceStreamReader.initialOffset()

Return the initial offset of the streaming data source. A new streaming query starts reading data from the initial offset. If Spark is restarting an existing query, it resumes from the checkpointed offset instead of the initial one.

Returns
dict

A dict, possibly nested, whose keys and values are primitive types such as integers, strings, and booleans.

Examples

>>> def initialOffset(self):
...     return {"partition-1": {"index": 3, "closed": True}, "partition-2": {"index": 5}}
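
The method above can also be shown in the context of a reader class. The following is a minimal sketch, not a definitive implementation: `CounterStreamReader`, its partition names, and the fixed offset values are hypothetical, and the other methods are stubbed only so the class can be instantiated. The JSON round trip at the end rests on the assumption that offsets are serialized as JSON for checkpointing, which is one reason to restrict them to primitive types.

```python
import json

try:
    from pyspark.sql.datasource import DataSourceStreamReader
except ImportError:
    # Fallback so this sketch also runs where PySpark is not installed.
    DataSourceStreamReader = object


class CounterStreamReader(DataSourceStreamReader):
    """Hypothetical reader that tracks one integer index per partition."""

    def initialOffset(self):
        # Used only when a brand-new query starts; a restarted query
        # resumes from its checkpointed offset instead.
        return {"partition-1": {"index": 0}, "partition-2": {"index": 0}}

    def latestOffset(self):
        # Stub: a real source would report how far data is available.
        return {"partition-1": {"index": 10}, "partition-2": {"index": 10}}

    def partitions(self, start, end):
        # Stub: a real source would plan partitions between start and end.
        return []

    def read(self, partition):
        # Stub: a real source would yield rows for the given partition.
        return iter(())


offset = CounterStreamReader().initialOffset()

# A nested dict of primitives survives a JSON round trip unchanged.
round_tripped = json.loads(json.dumps(offset))
```

Values such as `datetime` objects or tuples would not come back unchanged from such a round trip, so it is safer to encode them as strings or integers before returning them in an offset.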