Use pyspark.InheritableThread as the base thread class for threadpool #32

Description

@NivekNey

Context

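InheritableThread is a thread class that PySpark provides to keep a Python thread in sync with its corresponding JVM thread. As far as I can tell from the PySpark docs, it is the recommended thread class when pinned thread mode is enabled, so that inheritable attributes such as local properties carry over to the new thread.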

The ThreadPool class used in this package is Python's standard-library multiprocessing.pool.ThreadPool, whose workers are plain threading.Thread subclasses.

What are the pros and cons of switching? I don't know yet; I'll start looking into it.

Proposal

Without touching this package's source code, a user can already build their own ThreadPool on top of their own Thread class. If this package were to adopt the PySpark thread class itself, the steps might be:

  1. Create a custom DummyProcess that inherits from pyspark.InheritableThread.
  2. Create a custom ThreadPool that uses the custom DummyProcess.
  3. Use the custom ThreadPool in this package's SparkDistributedBackend._get_pool method.
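
The three steps above could look roughly like the sketch below. It is untested against a real Spark cluster and rests on a few assumptions: Python 3.8+ (where ThreadPool.Process is a static method taking a ctx argument), PySpark 3.1+ (where InheritableThread can be imported from the top-level pyspark package), and the names InheritableDummyProcess, InheritableThreadPool, and BackendSketch, which are made up for illustration. BackendSketch is only a stand-in for SparkDistributedBackend, not its actual code. The fallback to threading.Thread is there only so the sketch runs without PySpark installed.

```python
import threading
from multiprocessing.pool import ThreadPool

try:
    # Assumption: PySpark >= 3.1 exports InheritableThread at the top level.
    from pyspark import InheritableThread as BaseThread
except ImportError:
    # Fallback so this sketch still runs where PySpark is not installed.
    BaseThread = threading.Thread


class InheritableDummyProcess(BaseThread):
    """Step 1 (hypothetical name): a DummyProcess-like worker thread.

    Mirrors the small surface of multiprocessing.dummy.DummyProcess that
    the pool relies on: keyword construction, start(), and exitcode.
    """

    def __init__(self, group=None, target=None, name=None, args=(), kwargs=None):
        super().__init__(
            group=group, target=target, name=name, args=args, kwargs=kwargs or {}
        )
        self._pool_start_called = False

    def start(self):
        self._pool_start_called = True
        super().start()

    @property
    def exitcode(self):
        # The pool polls this to detect workers that have finished.
        if self._pool_start_called and not self.is_alive():
            return 0
        return None


class InheritableThreadPool(ThreadPool):
    """Step 2 (hypothetical name): a ThreadPool built on the custom worker."""

    @staticmethod
    def Process(ctx, *args, **kwds):
        # ctx is unused for threads, as in the standard ThreadPool.
        return InheritableDummyProcess(*args, **kwds)


class BackendSketch:
    """Step 3: hypothetical stand-in for SparkDistributedBackend."""

    def __init__(self, n_jobs):
        self._n_jobs = n_jobs
        self._pool = None

    def _get_pool(self):
        # Lazily create the pool, swapping in the custom ThreadPool.
        if self._pool is None:
            self._pool = InheritableThreadPool(self._n_jobs)
        return self._pool


def square(x):
    return x * x


if __name__ == "__main__":
    pool = BackendSketch(n_jobs=2)._get_pool()
    try:
        print(pool.map(square, range(5)))
    finally:
        pool.close()
        pool.join()
```

Overriding only the Process hook keeps the rest of ThreadPool untouched, so the change to _get_pool would be a one-line swap of the pool class. Whether local properties actually propagate to the workers would still need checking with pinned thread mode enabled.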
