A singleton is a class designed to permit only a single instance. Singletons have a bad reputation, but they do have (limited) valid uses. They also cause plenty of headaches, and may throw errors when used with multiprocessing in Python. This article explains why, and what you can do to work around it.
A Singleton in the Wild
Singleton usage is exceedingly rare in Python. I’ve been writing Python code for 5 years and never came across one until last week. Having never studied the singleton design pattern, I was perplexed by the convoluted logic. I was also frustrated by the errors that kept popping up when I was forced to use it.
The most frustrating aspect of using a singleton for me came when I tried to run some code in parallel with joblib. Inside the parallel processes, the singleton always acted like it hadn’t been instantiated yet. My parallel code only worked when I added another instantiation of the singleton inside the function called by the process. It took me a long time to figure out why.
Why Singletons Fail with Multiprocessing
The best explanation I’ve found for why singletons throw errors with multiprocessing in Python is this answer from Stack Overflow:
Each of your child processes runs its own instance of the Python interpreter, hence the singleton in one process doesn’t share its state with those in another process.
https://stackoverflow.com/questions/45077043/make-singleton-class-in-multiprocessing
Your singleton instance won’t be shared across processes.
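To see this isolation for yourself, here’s a minimal sketch that uses only the standard library’s multiprocessing module rather than joblib. The `Registry` class and the `bump` and `run_demo` functions are made-up names for illustration, and I’m assuming a POSIX system where the "fork" start method is available. With fork, the child begins as a copy of the parent, whereas a freshly started interpreter typically begins without the state the parent built up at runtime. In both cases, changes made in the child never reach the parent.

```python
import multiprocessing as mp


class Registry:
    """Stand-in for a singleton: the state lives on the class itself."""
    value = 0


def bump(queue):
    # Runs in the child process and changes only the child's copy.
    Registry.value += 1
    queue.put(Registry.value)


def run_demo():
    # Assumption: "fork" is available (POSIX only).
    ctx = mp.get_context("fork")
    queue = ctx.Queue()
    child = ctx.Process(target=bump, args=(queue,))
    child.start()
    child_value = queue.get()  # read the result before joining
    child.join()
    # Return what the child saw and what the parent still holds.
    return child_value, Registry.value


if __name__ == "__main__":
    child_value, parent_value = run_demo()
    print("child saw:", child_value)
    print("parent still has:", parent_value)
```

The child increments its own copy of `Registry.value`, while the parent’s value stays where it was.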
Working Around Singleton Errors in Multiprocessing
There are several ways to work around this problem. Let’s start with a basic singleton class and see how a simple parallel process will fail.
    import time
    from joblib import Parallel, delayed


    class OnlyOne:
        """Singleton Class, inspired by
        https://python-3-patterns-idioms-test.readthedocs.io/en/latest/Singleton.html"""

        class __OnlyOne:
            def __init__(self, arg):
                if arg is None:
                    raise ValueError("Pretend empty instantiation breaks code")
                self.val = arg

            def __str__(self):
                # str() so that non-string values can be concatenated
                return repr(self) + str(self.val)

        instance = None

        def __init__(self, arg=None):
            if not self.instance:
                self.instance = self.__OnlyOne(arg)
            else:
                self.instance.val = arg

        def __getattr__(self, name):
            # Delegate unknown attributes to the wrapped instance
            return getattr(self.instance, name)


    def worker(num):
        """Single worker function to run in parallel.
        Assume that this function has to do an empty
        instantiation of the singleton.
        """
        one = OnlyOne()
        time.sleep(0.1)
        one.val += num
        return one.val


    # Instantiate singleton
    one = OnlyOne(0)
    print(one.val)

    # Try to run in parallel
    # Will hit the ValueError raised by the
    # empty instantiation
    res = Parallel(n_jobs=-1, verbose=10)(
        delayed(worker)(i) for i in range(10)
    )
    print(res)
In this example, the worker function has to instantiate the singleton with no argument because it needs an attribute stored in the singleton. It can’t pass a value to instantiate it with, because that value is the very thing it’s trying to read from the attribute.
Environment Variables
Here’s a simple solution I came up with that worked for me, and might for you as well. The solution here uses environment variables to store state across processes.
    import os
    import time
    from joblib import Parallel, delayed


    class OnlyOne:
        """Singleton Class, inspired by
        https://python-3-patterns-idioms-test.readthedocs.io/en/latest/Singleton.html
        Modified to work with parallel processes using environment
        variables to store state across processes.
        """

        class __OnlyOne:
            def __init__(self, arg):
                if arg is None:
                    raise ValueError("Pretend empty instantiation breaks code")
                self.val = arg

            def __str__(self):
                # str() so that non-string values can be concatenated
                return repr(self) + str(self.val)

        instance = None

        def __init__(self, arg=None):
            if not self.instance:
                if arg is None:
                    # Look up val from env var. Env vars are always strings;
                    # this example assumes val is an integer.
                    env_val = os.getenv('SINGLETON_VAL')
                    arg = int(env_val) if env_val is not None else None
                else:
                    # Set env var so all workers use the same val
                    # (os.environ only accepts strings)
                    os.environ['SINGLETON_VAL'] = str(arg)
                self.instance = self.__OnlyOne(arg)
            else:
                self.instance.val = arg

        def __getattr__(self, name):
            # Delegate unknown attributes to the wrapped instance
            return getattr(self.instance, name)


    def worker(num):
        """Single worker function to run in parallel.
        Assume that this function has to do an empty
        instantiation of the singleton.
        """
        one = OnlyOne()
        time.sleep(0.1)
        one.val += num
        return one.val


    # Instantiate singleton
    one = OnlyOne(0)
    print(one.val)

    # Run in parallel worry-free
    res = Parallel(n_jobs=-1, verbose=10)(
        delayed(worker)(i) for i in range(10)
    )
    print(res)
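One caveat with this approach is that environment variables can only hold strings, so anything else has to be converted on the way in and out. Here’s a minimal sketch of one way to do that with the standard library’s json module. The helper names `save_val` and `load_val` are my own invention for illustration, and the sketch assumes the value is JSON-serializable.

```python
import json
import os

ENV_KEY = "SINGLETON_VAL"  # same variable name used in the example above


def save_val(val):
    """Store a JSON-serializable value in the environment as a string."""
    os.environ[ENV_KEY] = json.dumps(val)


def load_val(default=None):
    """Read the value back, restoring its original type."""
    raw = os.environ.get(ENV_KEY)
    return default if raw is None else json.loads(raw)


save_val(0)
restored = load_val()
print(type(restored), restored)
```

Child processes started after `save_val` is called inherit the variable, so a worker can call `load_val` to recover the value with its type intact.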
Pass Singleton as Argument
Another solution is to simply pass the instantiated singleton instance as an argument to the worker function.
    import time
    from joblib import Parallel, delayed


    class OnlyOne:
        """Singleton Class, inspired by
        https://python-3-patterns-idioms-test.readthedocs.io/en/latest/Singleton.html"""

        class __OnlyOne:
            def __init__(self, arg):
                if arg is None:
                    raise ValueError("Pretend empty instantiation breaks code")
                self.val = arg

            def __str__(self):
                # str() so that non-string values can be concatenated
                return repr(self) + str(self.val)

        instance = None

        def __init__(self, arg=None):
            if not self.instance:
                self.instance = self.__OnlyOne(arg)
            else:
                self.instance.val = arg

        def __getattr__(self, name):
            # Delegate unknown attributes to the wrapped instance
            return getattr(self.instance, name)


    def worker(num, one):
        """Single worker function to run in parallel.
        """
        time.sleep(0.1)
        one.val += num
        return one.val


    # Instantiate singleton
    one = OnlyOne(0)
    print(one.val)

    # Run in parallel succeeds when one is passed
    # as arg to worker
    res = Parallel(n_jobs=-1, verbose=10)(
        delayed(worker)(i, one) for i in range(10)
    )
    print(res)
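Keep in mind that an object passed to another process is serialized and rebuilt on the other side, so each worker operates on its own copy. Here’s a minimal sketch that simulates this with a plain pickle round trip instead of real worker processes. The `SimpleNamespace` object is only a stand-in for the singleton, and the `shipped` name is hypothetical.

```python
import pickle
from types import SimpleNamespace

# Stand-in for the singleton instance (for illustration only).
one = SimpleNamespace(val=0)


def worker(num, one):
    one.val += num
    return one.val


# Simulate shipping the argument to another process:
# it is serialized and rebuilt, so the worker gets a copy.
shipped = pickle.loads(pickle.dumps(one))
result = worker(5, shipped)

print(result)   # value computed on the copy
print(one.val)  # the original object is unchanged
```

This is why the worker returns its result rather than relying on changes to `one` being visible in the parent process.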