Understanding Multithreading

So I read this thread:

And got the impression that many people don’t understand how (and if) Gig Performer can use multiple cores.

I didn’t fully understand it either. In my experience, GP performs really well, but since this seems to be a topic of interest, let’s demystify it! I did some research on the forum and came up with some assumptions and further questions that I’d be interested to hear your opinions on (especially @djogon and @dhj).

Assumption 1: Only the plugins in the currently active rackspace consume CPU cycles

Was confirmed here:

Assumption 2: Gig Performer launches all plugins in the same thread

All the blocks in connection view (= plugins) are launched in the same thread by GP. As a single thread cannot be run on multiple cores at once, you get a CPU-intensive thread which cannot use more than a single core.

Assumption 3: Plugins can launch their own threads to make use of multiple cores

So plugins can make use of multiple cores on their own, independent of GP. I checked whether a few popular virtual instruments actually do that:

So that means, plugins like Kontakt can easily distribute their loads over multiple cores and do not stress the “main GP audio thread” very much, whereas in Omnisphere, whether you’re using multiple sounds in a single instance or multiple instances with one sound each, they all run in a single thread.

Assumption 4: GP could run different plugins in different threads

I understand @djogon’s argument concerning parallel processing:

But I have a counter-argument: all “classic” DAWs I checked (Ableton, Logic + Mainstage, Reason (with some troubles), Reaper, Studio One) distribute plugins over multiple (DAW-owned) threads. Although the plugins of a single channel cannot be spread across multiple threads, these DAWs seem to be able to distribute plugins on a per-channel basis.
The GP equivalent of two independent channels in a classic DAW would be the following routing:

Currently, all the processing here would have to be done in a single thread, right?
What are the reasons that Gig Performer can not assign the left and right path to different threads?
Is this a limitation given by the principles of GP? Or by the underlying framework? Or would it be possible, but just not implemented yet?

Looking forward to your thoughts on this! :slight_smile:

Sure - if you use a channel strip model, it’s not too hard to do this since in principle you are forced to have independent chains, although you will run into some sync issues and consequently some slowdown as soon as you have Send busses. As an analogy, suppose you’re in Photoshop and you run a command to change all pixels to RED. You could easily use multiple cores to do this since the color of each pixel is independent of anything else going on. But if you try to do something such as “change all pixels to the average of the brightness of the pixels surrounding them”, then you can’t parallelize it like that, since you don’t know what color your new pixel will be until you’ve finished calculating the brightness of the surrounding pixels. Therefore, you have to wait, and doing stuff in parallel doesn’t save you time.

But referring to your graph, the problem is, as soon as you do this (see red arrow),

you’re screwed because you have a dependency.
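
To make the dependency point concrete, here is a minimal Python sketch. It is not how Gig Performer works internally; the block names and the `independent_groups` helper are made up for illustration, and the shared audio output is left out. It treats the connections as a graph and groups together all blocks that are linked in any way. Only separate groups could, even in principle, be handed to different threads without any coordination between them.

```python
from collections import defaultdict

def independent_groups(connections, blocks):
    """Group blocks that are linked by any connection (hypothetical helper)."""
    # Treat every connection as undirected: a link in either direction
    # means the two blocks cannot be scheduled without regard to each other.
    neighbours = defaultdict(set)
    for src, dst in connections:
        neighbours[src].add(dst)
        neighbours[dst].add(src)

    seen = set()
    groups = []
    for block in blocks:
        if block in seen:
            continue
        # Flood-fill from this block to collect everything connected to it.
        stack, group = [block], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            group.add(current)
            stack.extend(neighbours[current] - seen)
        groups.append(sorted(group))
    return groups

# Made-up block names standing in for two plugin chains.
blocks = ["SynthA", "ReverbA", "SynthB", "DelayB"]
two_chains = [("SynthA", "ReverbA"), ("SynthB", "DelayB")]
# One extra cable from the first chain into the second (the "red arrow").
with_cross_link = two_chains + [("SynthA", "DelayB")]

print(independent_groups(two_chains, blocks))       # two separate groups
print(independent_groups(with_cross_link, blocks))  # one merged group
```

With the two plain chains you get two groups; add a single cross connection and everything collapses into one group. This is only the coarsest possible view. Working out what could still run in parallel inside a merged group is the genuinely hard part.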

Makes sense. But for cases without such dependencies (which is the majority of my rackspaces at least) - wouldn’t a multi-threaded optimization be sensible?

I understand that this comes with a number of pitfalls, including the ones you described, and additionally it is not transparent to the user which connections might force them back to single-core processing.

I know it looks easy when you just look at it, but parallelizing data flow and determining non-dependencies in a graph is a seriously non-trivial problem. You still have to deal with sync at the end, so if one sequence finishes first, it doesn’t buy you anything.
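
A back-of-the-envelope model of the sync point, as a small Python sketch. All numbers are invented, and `sync_cost_us` is an assumed fixed overhead for joining the threads, not a measured value. The audio buffer can only be delivered once the slowest path is done, so the parallel time is the maximum of the path times plus the sync cost, not their average.

```python
def serial_time_us(path_costs_us):
    """One thread processes every path back to back."""
    return sum(path_costs_us)

def parallel_time_us(path_costs_us, sync_cost_us):
    """One thread per path; the buffer is ready only when the slowest
    path has finished and the results have been joined."""
    return max(path_costs_us) + sync_cost_us

# Made-up per-buffer processing costs in microseconds.
balanced = [2000, 2000]   # two equally heavy paths
lopsided = [3500, 500]    # one heavy path, one light path
sync_cost_us = 300        # assumed overhead for joining the threads

for name, costs in (("balanced", balanced), ("lopsided", lopsided)):
    print(name, serial_time_us(costs), parallel_time_us(costs, sync_cost_us))
```

In this toy model the balanced case drops from 4000 µs to 2300 µs, but the lopsided case only drops from 4000 µs to 3800 µs: the light path finishes early and then just waits.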

Incidentally, one thing that a channel strip system can do easily and automatically is turn off (bypass) its inserted plugins when the channel strip is muted because it is a separate path. While we can’t (easily) do that automatically for the same reason as described above, it’s really easy to write a script that will respond to a mute button and turn on or off plugins that you select. That’s a simple optimization.

So, to come back to the original question:

Your answer would be:

Because, for the general case, this involves solving several hard problems, does not guarantee big benefits, and introduces a whole new class of possible bugs. So it’s probably not worth it.
Other DAWs have concepts that avoid some of these problems.

Did I understand that correctly?

I’ve seen the scripts about automatic bypassing and I think it is indeed a simple and very effective optimization.

But I actually often layer multiple Omnisphere sounds, so I would not want to either mute or bypass any instance here.
However, I will look into other ways to get those sounds in GP:

  • Resampling the layered sound
  • Recreating the sounds in a simpler way or with other tools
  • Moving some independent paths to another GP instance which is synced to the first
  • Kind of ugly, but also interesting: Loading another DAW as a plugin into GP, which then runs multiple plugin instances in multiple threads

Going back to Mainstage is not an option, in any case. I’m very happy with GP’s overall approach and stability :smiley:

I think people are also forgetting that plugin processing, specifically the audio part, is not the only thing going on – there are lots of other threads both in Gig Performer (e.g., the entire GUI, GP Script processing, OSC processing, function generators among many other things) and in individual plugins, all doing work and the OS is allocating those threads to different cores.

It’s not uncommon to think that multi-threading is the answer to boosting performance, but if implemented in certain contexts, or implemented poorly in almost any context, it can and often does have the opposite effect.

Yeah, that’s the downside of hardware companies marketing “2-core, 8-core, 16-core, your system will run much faster, blah blah blah”: there’s little understanding that it just doesn’t quite work like that as soon as there are dependencies.

It’s generally much better to let the OS handle such things (that’s what an OS is for!) and if a particular plugin knows how to vectorize some of its operations, great.

For me, that was the whole point of this thread: demystifying the whole multithreading topic and GP’s approach to it.
Thanks so much for your input, everyone!