Understanding Multithreading

simon · June 26, 2019, 12:33pm

So I read this thread:

And got the impression that many people don’t understand how (and if) Gig Performer can use multiple cores.

I didn’t understand it either. In my experience, GP performs really well, but since this seems to be a topic of interest, let’s demystify it! I did some research on the forum and came up with some assumptions and further questions that I’d like to hear your opinions on (especially @djogon and @dhj).

Assumption 1: Only the plugins in the currently active rackspace consume CPU cycles

Was confirmed here:

Assumption 2: Gig Performer launches all plugins in the same thread

All the blocks in connection view (= plugins) are launched in the same thread by GP. As a single thread cannot be run on multiple cores at once, you get a CPU-intensive thread which cannot use more than a single core.

Assumption 3: Plugins can launch their own threads to make use of multiple cores

So plugins can make use of multiple cores on their own, independent of GP. I checked whether a few popular virtual instruments actually do that:

Kontakt

Screenshot of settings dialog

Reaktor Source
Diva

Screenshot of main window
Pianoteq (limited to two cores) Source
Omnisphere (single-threaded, despite being a known CPU hog) Source

So that means, plugins like Kontakt can easily distribute their loads over multiple cores and do not stress the “main GP audio thread” very much, whereas in Omnisphere, whether you’re using multiple sounds in a single instance or multiple instances with one sound each, they all run in a single thread.

Assumption 4: GP could run different plugins in different threads

I understand @djogon’s argument concerning parallel processing:

But I have a counter-argument: all “classic” DAWs I checked (Ableton, Logic + Mainstage, Reason (with some troubles), Reaper, Studio One) distribute plugins over multiple (DAW-owned) threads. Although the plugins of a single channel cannot be spread across multiple threads, these DAWs seem to be able to distribute the plugins on a per-channel basis.
The equivalent of two independent channels in a classic DAW in GP would be the following routing:

Currently, all the processing here would have to be done in a single thread, right?
What are the reasons that Gig Performer cannot assign the left and right paths to different threads?
Is this a limitation given by the principles of GP? Or by the underlying framework? Or would it be possible, but just not implemented yet?

Looking forward to your thoughts on this!

dhj · June 26, 2019, 1:11pm

Sure - if you use a channel strip model, it’s not too hard to do this since in principle you are forced to have independent chains, although you will run into some sync issues and consequently some slowdown as soon as you have Send busses. As an analogy, suppose you’re in Photoshop and you run a command to change all pixels to RED. You could easily use multiple cores to do this since the color of each pixel is independent of anything else going on. But if you try to do something such as “change all pixels to the average of the brightness of the pixels surrounding them”, then you can’t parallelize it like that since you don’t know what color your new pixel will be until you’ve finished calculating the brightness of the surrounding colors. Therefore, you have to wait, and doing stuff in parallel doesn’t save you time.

But referring to your graph, the problem is, as soon as you do this (see red arrow),

you’re screwed because you have a dependency.
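To make the dependency point concrete, here is a minimal Python sketch (not how GP works internally). It treats the wiring as a list of connections and groups every block that is linked to another block. The block names are hypothetical, and the shared audio output is deliberately left out. Two separate chains form two groups that a simple per-chain scheduler could hand to different threads; a single cross connection merges them into one group.

```python
def independent_groups(connections):
    """Group plugin blocks that are linked by any connection (union-find)."""
    parent = {}

    def find(block):
        parent.setdefault(block, block)
        while parent[block] != block:
            parent[block] = parent[parent[block]]  # path halving
            block = parent[block]
        return block

    for source, target in connections:
        root_source, root_target = find(source), find(target)
        if root_source != root_target:
            parent[root_source] = root_target

    groups = {}
    for block in list(parent):
        groups.setdefault(find(block), set()).add(block)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical wiring: two chains that never touch each other.
two_chains = [("Synth A", "Reverb A"), ("Synth B", "Delay B")]
# Same wiring plus one cross connection (the "red arrow" case).
with_cross = two_chains + [("Synth A", "Delay B")]

print(len(independent_groups(two_chains)))  # two separate groups
print(len(independent_groups(with_cross)))  # one merged group
```

A smarter scheduler could still overlap parts of a merged group, but it would then have to track which block waits for which, and that is where the hard part starts.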

simon · June 26, 2019, 1:21pm

Makes sense. But for cases without such dependencies (which is the majority of my rackspaces at least) - wouldn’t a multi-threaded optimization be sensible?

I understand that this comes with a number of pitfalls, including the ones you described, and additionally it is not transparent to the user which connections might force them back to single-core processing.

dhj · June 26, 2019, 2:04pm

I know it looks easy when you just look at it but parallelizing data flow and determining non-dependencies in a graph is a seriously non-trivial problem. You still have to deal with sync at the end and so if one sequence finishes first, it doesn’t buy you anything.
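The sync point can be illustrated with a rough timing model in Python. It is a sketch under simple assumptions: each chain has a fixed cost per audio block, the parallel version has to wait for the slowest chain, and handing work to other threads costs a fixed overhead. All numbers are invented for illustration.

```python
def block_time_ms(chain_costs_ms, parallel, sync_overhead_ms=0.0):
    """Time to finish one audio block under a very simple cost model."""
    if parallel:
        # Every chain must be done before mixing: the slowest one decides.
        return max(chain_costs_ms) + sync_overhead_ms
    # Single thread: the chains simply run one after another.
    return sum(chain_costs_ms)


uneven = [4.0, 1.0]  # one heavy chain, one light chain (hypothetical)
even = [2.5, 2.5]    # two equally heavy chains (hypothetical)

for costs in (uneven, even):
    serial = block_time_ms(costs, parallel=False)
    threaded = block_time_ms(costs, parallel=True, sync_overhead_ms=0.5)
    print(costs, "serial:", serial, "ms, parallel:", threaded, "ms")
```

In this model the uneven case barely improves because the heavy chain dominates, while the evenly split case gains the most. Real plugin costs vary from block to block, so the actual benefit is harder to predict.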

Incidentally, one thing that a channel strip system can do easily and automatically is turn off (bypass) its inserted plugins when the channel strip is muted because it is a separate path. While we can’t (easily) do that automatically for the same reason as described above, it’s really easy to write a script that will respond to a mute button and turn on or off plugins that you select. That’s a simple optimization.

simon · June 26, 2019, 2:22pm

So, to come back to the original question:

Your answer would be:

Because, for the general case, this involves solving several hard problems, does not guarantee big benefits, and introduces a whole new class of possible bugs. So it’s probably not worth it.
Other DAWs have concepts that avoid some of these problems.

Did I understand that correctly?

simon · June 26, 2019, 2:33pm

I’ve seen the scripts about automatic bypassing and I think it is indeed a simple and very effective optimization.

But I actually often layer multiple Omnisphere sounds, so I would not want to either mute or bypass any instance here.
However, I will look into other ways to get those sounds in GP:

Resampling the layered sound
Recreating the sounds in a simpler way or with other tools
Move some independent paths to another GP instance which is synced to the first
Kind of ugly, but also interesting: Loading another DAW as a plugin into GP, which then runs multiple plugin instances in multiple threads

Going back to Mainstage is not an option, in any case. I’m very happy with GP’s overall approach and stability.

dhj · June 26, 2019, 3:15pm

I think people are also forgetting that plugin processing, specifically the audio part, is not the only thing going on – there are lots of other threads both in Gig Performer (e.g., the entire GUI, GP Script processing, OSC processing, function generators among many other things) and in individual plugins, all doing work and the OS is allocating those threads to different cores.

xpansion · June 26, 2019, 9:13pm

It’s not uncommon to think that multi-threading is the answer to boosting performance, but if implemented in certain contexts, or implemented poorly in almost any context, it can and often does have the opposite effect.

dhj · June 26, 2019, 11:48pm

Yeah, that’s the downside of hardware companies marketing “2-core, 8-core, 16-core, your system will run much faster, blah blah blah”: there’s little understanding that it just doesn’t quite work like that as soon as there are dependencies.

It’s generally much better to let the OS handle such things (that’s what an OS is for!) and if a particular plugin knows how to vectorize some of its operations, great.

simon · June 30, 2019, 11:20am

For me, that was the whole point of this thread: demystifying the whole multithreading topic and GP’s approach to it.
Thanks so much for your input, everyone!

manhippo · June 26, 2021, 10:12am

Hey, this thread is really helpful. Thanks.

A follow-up question: When you switch rackspaces, does GP make use of separate cores?

I totally understand the need to keep everything on a single processor to allow total routing flexibility within a patch. However, as I understand it, when you switch between two racks, GP loads both so that it can crossfade between the two for a smooth transition. If that’s all running on one core, the potential for overload at the point of switching could be high, but if each rack is using a separate core, it should be no higher than while running either of the individual racks. Would be awesome to know how GP is handling stuff in that instance.

pianopaul · June 26, 2021, 10:17am

Simple answer, no

Therefore I avoid letting the CPU usage shown in GP go above 50%.

And I resample demanding sounds and use the samples in Kontakt.

Or you can use a second instance of GP.
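The 50% rule of thumb follows from simple arithmetic, sketched here in Python. It assumes that both rackspaces process audio on the same core during the switch and that their loads simply add up, which is a simplification. The function names and percentages are made up for illustration.

```python
def peak_load_during_switch(load_from_pct, load_to_pct):
    """Worst-case load while both rackspaces process audio at once."""
    return load_from_pct + load_to_pct


def switch_fits_on_one_core(load_from_pct, load_to_pct, limit_pct=100.0):
    """True if the combined load stays within the assumed limit."""
    return peak_load_during_switch(load_from_pct, load_to_pct) <= limit_pct


print(switch_fits_on_one_core(40.0, 45.0))  # both well under 50%
print(switch_fits_on_one_core(60.0, 55.0))  # each fine alone, not together
```

Two rackspaces that each stay below 50% can never exceed the limit together in this model, which is why that threshold is a safe target.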

manhippo · June 26, 2021, 10:31am

Ah ok. So the potential for overload if you are switching between two patches that run at 50%, say, is high?

Maybe I’m misunderstanding something technical and complicated, but wouldn’t that be quite a good way to make use of multiple cores?

Also, I totally see the logic of the above flow diagrams from dhj, but couldn’t one implement multicore support, for those who need it, by making a ‘core container’ or similar, i.e. a container element that can have its own internal chain and that uses a core other than the one used by the main patch? It would be a black box, so there would be no possibility of mixing routing from within the container, only patching signal in and patching signal out. It would be a discrete object, but it would give that flexibility to users who want to squeeze more out of their existing hardware and are happy to trade off a little flexibility.

Alternatively, if that feels too open to problems, couldn’t GP detect the number of cores and then offer ‘core lanes’ in its rackspace? I.e., say it’s a 4-core processor: GP decides that one core is needed to run system stuff etc. (I appreciate that the OS probably isn’t that neat, just spitballing here), so it sets up two or three core lanes. These are discrete routing spaces, where all the normal patching can take place, but nothing can be patched between them, i.e. two or three independent, parallel processing environments, leveraging the additional power available from the cores.

It’s not hugely relevant to me as I’m a guitarist and violinist, and I tend to keep my patches/racks pretty light and efficient to allow for maximum processing headroom, and I’m not running synths or romplers or other resource-munching VSTis. But if I did decide I wanted to go to town on the creative effects, being able to run a channel for my guitar and a channel for my violin on two separate cores might be a really useful use case, and the loss of flexibility wouldn’t matter to me at all, as I can’t think of an instance in which I’d want to route audio between my two instruments: even if I switch back and forth within a song, I literally can’t play both at the same time. I imagine if I were a keyboard player, the instances in which this would be useful would be many times greater.
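The ‘core lanes’ idea can be sketched in Python as follows. This is a toy model, not a proposal for GP’s real audio engine: each lane is a list of hypothetical plugin functions, every lane gets its own worker thread, and the outputs are only mixed once all lanes have finished. Python threads do not give true parallelism for this kind of work, and a real-time engine would not create a thread pool for every audio block, so treat it purely as an illustration of the structure.

```python
from concurrent.futures import ThreadPoolExecutor


def gain(factor):
    """Hypothetical stand-in for a plugin: scales every sample."""
    return lambda block: [sample * factor for sample in block]


def run_lane(plugins, block):
    """Run one lane's plugins in series; nothing crosses between lanes."""
    for plugin in plugins:
        block = plugin(block)
    return block


def process_block(lanes, inputs):
    """Process each lane on its own worker, then mix at a sync point."""
    with ThreadPoolExecutor(max_workers=len(lanes)) as pool:
        # list() collects every result, so all lanes are done after this line.
        outputs = list(pool.map(run_lane, lanes, inputs))
    return [sum(samples) for samples in zip(*outputs)]


guitar_lane = [gain(0.5)]  # hypothetical guitar effects chain
violin_lane = [gain(2.0)]  # hypothetical violin effects chain
mixed = process_block([guitar_lane, violin_lane],
                      [[1.0, 2.0], [1.0, 1.0]])
print(mixed)
```

Even in this toy version the mix step has to wait for the slowest lane, so the sync cost discussed earlier in the thread does not go away; the lanes only remove the need to analyse dependencies between them.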

pianopaul · June 26, 2021, 10:33am

Sure, and maybe we’ll get improvements in future versions.

manhippo · June 26, 2021, 11:01am

Oh for sure. I’m just suggesting a couple of topologies that might be interesting to the devs. Or maybe not. There may be lots of technical reasons why none of that works that I don’t understand.

pianopaul · June 26, 2021, 11:03am

Other DAWs support multicore because of their strict channel design.
With free wiring this is much more complicated.
But as companies like Oracle always say: if performance is an issue, buy better hardware.

dhj · June 26, 2021, 11:05am

Have you actually had a problem due to this?

manhippo · June 26, 2021, 11:06am

Yes absolutely, as illustrated by dhj above. That’s why I’m suggesting creating core-dedicated ‘channels’ that still allow for all of the currently available free routing within each channel. Best of both worlds.

pianopaul · June 26, 2021, 11:07am

That is a really demanding task for the developers.
If you have proven ideas that can help, you are very welcome to share them.

manhippo · June 26, 2021, 11:16am

As I say above, no, but I also never switch rackspaces during a song for this reason, and as detailed above, I tailor my racks for efficiency so I have plenty of headroom. Running out of processor power is something we have all experienced in the digital audio world, so I’m already prearmed with an array of tactics for avoiding that scenario and I design sounds accordingly.

I have seen others on the forum expressing this concern though, mainly keyboard players running more intensive vsts. To be clear, this isn’t me complaining about GP (I hope that was crystal clear in my tone, but maybe it wasn’t), this is me reflecting on the above discussion about multi core use and proposing some ideas for future development that might be useful? After all, we can probably all agree that using more of one’s available computing power is a win for GP?