
Singleton Fails with Multiprocessing in Python

A singleton is a class designed to permit only a single instance. Singletons have a bad reputation, but they do have (limited) valid uses. They also cause plenty of headaches, and they may throw errors when used with multiprocessing in Python. This article explains why, and what you can do to work around it.

A Singleton in the Wild

Singleton usage is exceedingly rare in Python. I’ve been writing Python code for 5 years and never came across one until last week. Having never studied the singleton design pattern, I was perplexed by the convoluted logic. I was also frustrated by the errors that kept popping up when I was forced to use it.

The most frustrating part came when I tried to run some code in parallel with joblib. Inside the parallel processes, the singleton always acted like it hadn’t been instantiated yet. My parallel code only worked when I added another instantiation of the singleton inside the function called by each process. It took me a long time to figure out why.

Why Singletons Fail with Multiprocessing

The best explanation for why singletons throw errors with multiprocessing in Python is this answer from Stack Overflow:

“Each of your child processes runs its own instance of the Python interpreter, hence the singleton in one process doesn’t share its state with those in another process.”

https://stackoverflow.com/questions/45077043/make-singleton-class-in-multiprocessing

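In other words, your singleton instance won’t be shared across processes: instantiating it in the parent process does nothing for a worker that starts as a fresh interpreter.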

Working Around Singleton Errors in Multiprocessing

There are several ways to work around this problem. Let’s start with a basic singleton class and see how a simple parallel process will fail.

In this example, the worker function has to instantiate the singleton with no arguments, because all we want is to read an attribute stored in it. We can’t pass a value at instantiation: that value is the very thing we’re trying to read from the attribute.

Environment Variables

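Here’s a simple solution that worked for me and might work for you as well: use environment variables to carry state across processes. Child processes inherit the parent’s environment, so a value written to os.environ before the workers start is visible inside them.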
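As a rough sketch of the environment variable approach, assuming the same hypothetical Settings singleton as before: an explicit instantiation exports the value, and an empty instantiation falls back to whatever was exported. The variable name SETTINGS_VALUE is made up for illustration, and ProcessPoolExecutor again stands in for joblib.

```python
import multiprocessing
import os
from concurrent.futures import ProcessPoolExecutor

ENV_KEY = "SETTINGS_VALUE"  # hypothetical variable name


class Settings:
    """Hypothetical singleton that mirrors its value into the environment."""

    _instance = None

    def __new__(cls, value=None):
        if cls._instance is None:
            if value is None:
                # Empty instantiation: fall back to what the parent exported.
                value = os.environ.get(ENV_KEY)
            else:
                # Explicit instantiation: export the value for child processes.
                value = str(value)
                os.environ[ENV_KEY] = value
            cls._instance = super().__new__(cls)
            cls._instance.value = value
        return cls._instance


def worker(x):
    # Environment variables are always strings, so convert back before use.
    return int(Settings().value) * x


if __name__ == "__main__":
    Settings(10)  # must run before the worker processes are started
    ctx = multiprocessing.get_context("spawn")
    with ProcessPoolExecutor(max_workers=2, mp_context=ctx) as pool:
        print(list(pool.map(worker, [1, 2, 3])))
```

The trade-off is that environment variables only hold strings, so this suits small, simple values. Anything more structured would have to be serialized first.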

Pass Singleton as Argument

Another solution is to pass the already instantiated singleton as an argument to the worker function. The instance is pickled and sent to each worker process, so it has to be picklable.
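A minimal sketch of passing the instance explicitly, again assuming the hypothetical Settings singleton and using ProcessPoolExecutor in place of joblib. functools.partial binds the instance to the worker so the pool only has to map over the remaining argument.

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor
from functools import partial


class Settings:
    """Hypothetical singleton: the first call fixes `value`, later calls reuse it."""

    _instance = None

    def __new__(cls, value=None):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.value = value
        return cls._instance


def worker(settings, x):
    # No lookup through the class: the instance arrives as an argument.
    return settings.value * x


if __name__ == "__main__":
    settings = Settings(10)
    ctx = multiprocessing.get_context("spawn")
    with ProcessPoolExecutor(max_workers=2, mp_context=ctx) as pool:
        # partial binds the instance, which is pickled and sent with each task.
        print(list(pool.map(partial(worker, settings), [1, 2, 3])))
```

The worker no longer depends on hidden global state, which also makes it easier to test on its own. The cost is that every function in the call chain needs the extra parameter.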