There is no access synchronization in Nuklei methods. An object that is modified by one thread should therefore never be accessed by a concurrent thread.
However, Nuklei can be used in multiple threads, as long as objects are not read-modified by multiple threads concurrently.
If you are using multiple threads to decrease computation time, use OpenMP, not pthreads. Using pthreads will not significantly speed up your program.
Nuklei classes do not protect their data members from concurrent access. As a result, an object that is modified by one thread should never be accessed by a concurrent thread.
Nuklei does make use of global variables, to reference two program-wide random number generators (see the Random class and Random.cpp). Nuklei serializes access to these generators via mutexes. Therefore, it is always safe to run Nuklei code in concurrent threads (provided that the data access constraint above is respected).
Unfortunately, random numbers are called a lot in Nuklei applications, and protecting the generators with mutexes slows multithreaded applications dramatically (on a 16-core processor, for a typical Nuklei application, running 16 threads concurrently takes five times longer than running only one of those threads alone). For this reason, Random.cpp sets up a vector of random generators, and it allows OpenMP threads to each access a different generator. The index of the generator that a thread uses is given by omp_get_thread_num()
. The vector of generators is initialized when the program starts, before main()
is called, and its size is set to the value returned by omp_get_max_threads()
. A vector of mutexes still serializes access to these generators. These mutexes should however never block in OpenMP threads, as long as there is only a single pool of threads active at any time, or, in other words, as long as no two threads has the same omp_get_thread_num()
. Naturally, the pool size should not be increased during the course of the program. Posix threads will all use the same generator, and wait at its mutex if necessary. As a result, pthread MT will be much slower than OpenMP MT.
The class parallelizer implements several parallelization schemes including OpenMP, pthread, and fork.
Contact me if you want to discuss this issue further. I've been working on multithreading Nuklei for a while and I might be able to help you.
For instance, what if I get the following error:
Short answer: Try normalizing all the quaternions you give to Nuklei:
Long answer:
Nuklei expects that the data it receives is well-formed to an error of 1e-6. Providing Nuklei with a unit quaternion, or unit vector, whose norm is larger/smaller than 1 by more than 1e-6 will trigger exceptions at various places in the code.
When operating on quaternions, their value may drift away from normality. At places where it can afford it, Nuklei performs re-normalization of its data. Where it cannot, it checks that the data is ok. If not, it quits. Even though this behavior can appear quite strict, it often helps finding bugs. I have thus decided to leave it as is for the moment.