CPU "cycles" don't really get wasted by inactive instances. An inactive instance isn't using any CPU core cycles.
It's not an exact science, and it depends heavily on your hardware and your music. I recommend you structure your instances in whatever way makes sense for how you work, and don't obsess over optimizing their core use. VePro has really good thread processing. Honestly, I mostly just leave my thread setting at around 80-90% of 2x the number of cores, and I never worry about it. If you find you're having issues in a particular project, then you can play with the thread setting. It really depends on many factors, and there is a lot of misunderstanding about the difference between "threads" and "cores".
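To put numbers on that rule of thumb, here is a tiny sketch. The function name and the 0.85 default are just my own illustration, not anything VePro provides, and I'm assuming "cores" means physical cores:

```python
def suggested_thread_setting(physical_cores, fraction=0.85):
    """Rule of thumb from above: a fraction (roughly 0.8-0.9) of 2x the cores.

    Hypothetical helper for illustration only; it is not a VePro API.
    """
    if physical_cores < 1:
        raise ValueError("physical_cores must be at least 1")
    # Never suggest fewer than one thread.
    return max(1, round(2 * physical_cores * fraction))


if __name__ == "__main__":
    for cores in (4, 8, 16):
        print(cores, "cores ->", suggested_thread_setting(cores), "threads")
```

Treat the result as a starting point to adjust per project, not as a correct answer.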
The only catch is that with a low thread-per-instance setting, an instance with a lot of instrument channels that are all busy will be constrained in what it can accomplish with only a few threads. If it can't keep up with the audio buffer, you'll get audio dropouts from that instance.
I think it's best to keep things simple. Organize your instances in a way that makes sense for how you work, then play with the thread setting if and when you have any CPU-related dropouts.
Nerd Alert
This topic can get complicated. Truthfully, we customers don't know exactly how VePro allocates threads internally, but keep in mind that "threads" do not equate exactly to "cores". Cores are a hardware component, and there is a fixed number of them in your computer. At any given instant, those N cores are working separately, in parallel, using CPU cycles, so to speak.
Threads are a software concept. When a program can do certain work in parallel with other work, it can create a kind of software job called a "thread". The program can create any number of threads, even more than the actual number of hardware cores, and those threads then take turns using the fixed number of hardware cores.
And at any given instant your computer has dozens of programs running, some of them with multiple threads too, which means you could have something like 50-100 total software threads across all the programs on your system, all taking turns using the fixed number of hardware cores. Some threads are idle! When they are idle they aren't using much CPU, but their mere existence means the OS has to take a little time periodically to see if they have work to do. So yeah, there is a little bit of overhead associated with each thread, even when it's idle, and that's why we don't just create a bazillion threads. You want plenty of threads, but not so many that the overhead starts to harm performance.
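If you want to see "more threads than cores" for yourself, here is a minimal Python sketch. It has nothing to do with VePro's internals; `spawn_idle_threads` is a made-up helper that parks a bunch of threads on an event so they sit idle:

```python
import os
import threading


def spawn_idle_threads(count):
    """Start `count` threads that block on an event, i.e. sit idle.

    Returns the threads plus the event that releases them.
    """
    release = threading.Event()
    threads = [
        threading.Thread(target=release.wait, daemon=True)
        for _ in range(count)
    ]
    for t in threads:
        t.start()
    return threads, release


if __name__ == "__main__":
    cores = os.cpu_count() or 1  # logical cores as reported by the OS
    threads, release = spawn_idle_threads(100)
    alive = sum(t.is_alive() for t in threads)
    print(f"{alive} idle threads alive on {cores} logical cores")
    release.set()  # wake them all up so they can exit
    for t in threads:
        t.join()
```

All 100 threads exist at once no matter how many cores the machine has. While they are blocked they use essentially no CPU, but the OS still has to keep track of every one of them.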
So anyway, back to your original question: idle threads are not much of an issue. They really aren't using many CPU cycles.
The challenge is more about making sure that none of the N hardware cores is ever sitting around idle. You want to make sure there are plenty of threads across all the VePro instances, so that at any given instant all N cores are working as hard as possible and none are sitting around waiting. If you have too few threads, you can end up with a few peaked cores while the other cores aren't doing much. The peaked cores in that case will give you audio dropouts.
But the problem comes when you have a busy instance that needs more than 3 threads to work efficiently, maybe because it has 10 channels in it (where each of those channels could potentially use its own thread to maximize core usage). If you have the setting at 3 but one instance is using 10 channels, then those 10 channels have to take turns using 3 threads, which in turn are taking turns on the cores. If the music came to a place where that was the only instance making sound, it would definitely be under-utilizing the cores.
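Here is a toy model of that 10-channels-on-3-threads case. It is a big simplification and says nothing about how VePro really schedules work: I'm assuming every channel costs the same, each thread handles one channel at a time, and nothing else is competing for the cores. The function names are made up for this sketch.

```python
import math


def busy_cores(active_channels, threads, cores):
    """Cores a single instance can keep busy at once (toy model)."""
    return min(active_channels, threads, cores)


def turns_needed(active_channels, threads, cores):
    """Rounds of turn-taking needed to process every channel once."""
    if active_channels == 0:
        return 0
    return math.ceil(active_channels / busy_cores(active_channels, threads, cores))


if __name__ == "__main__":
    cores = 8  # assumed machine size, purely for illustration
    for threads in (3, 10):
        print(
            f"threads={threads}: {busy_cores(10, threads, cores)} of {cores} "
            f"cores busy, {turns_needed(10, threads, cores)} turns per buffer"
        )
```

On an assumed 8-core machine, the setting of 3 keeps only 3 cores busy and needs 4 rounds to get through the 10 channels. Raising it to 10 keeps all 8 cores busy and gets through them in 2 rounds.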
Each VePro instance is like a completely separate VePro server, in a way. It's sharing all the cores with the other instances, and with other programs too. So how many threads should each of those instances allocate so that it won't under-utilize the cores, but also won't create too much overhead with too many threads? Too many active threads just get in each other's way while they take turns using the cores.
It depends a lot on your music too. If you will always have balanced playback, where all (or at least several) of your instances are playing approximately the same number of channels at approximately the same time, then yes, a low thread count would be closer to ideal. However, if you tend to have music that hits one instance hard for a while, then another instance hard for a while, and so on, then you'd want a larger thread setting. It just depends on you, your music, and how many cores you have.
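The balanced versus lopsided case can be sketched with the same kind of toy arithmetic. Again, this is only a rough model under my own assumptions (equal-cost channels, one channel per thread at a time), and `cores_in_use` and `threads_waiting` are hypothetical names:

```python
def runnable_threads(active_channels_per_instance, threads_per_instance):
    """Threads across all instances that have work to do right now."""
    return sum(
        min(channels, threads_per_instance)
        for channels in active_channels_per_instance
    )


def cores_in_use(active_channels_per_instance, threads_per_instance, cores):
    """Cores kept busy at this moment (toy model)."""
    return min(runnable_threads(active_channels_per_instance, threads_per_instance), cores)


def threads_waiting(active_channels_per_instance, threads_per_instance, cores):
    """Runnable threads that have no free core and must wait their turn."""
    runnable = runnable_threads(active_channels_per_instance, threads_per_instance)
    return max(0, runnable - cores)


if __name__ == "__main__":
    cores = 8                  # assumed machine size
    balanced = [4, 4, 4, 4]    # four instances, evenly busy
    lopsided = [16, 0, 0, 0]   # one instance doing all the work
    for setting in (2, 8):
        for name, load in (("balanced", balanced), ("lopsided", lopsided)):
            print(setting, name,
                  cores_in_use(load, setting, cores),
                  threads_waiting(load, setting, cores))
```

With a setting of 2, the balanced load fills all 8 cores but the lopsided load only uses 2. With a setting of 8, the lopsided load fills the machine, while the balanced load now has 8 extra threads queuing for a core, which is where the overhead comes in.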
I personally think it makes more sense to just organize your instances in the way that suits your workflow and not obsess too much about the threading stuff. Play around with the thread setting if you like. It might need to be different for different projects, because the way the music is arranged changes how the instances use their threads.
If that is too much to worry about, that is why I generally prefer one of two scenarios: either one super huge instance with all my instruments in it and the thread setting as high as possible,
or one instrument per instance, with a thread setting of 2.
When you go to the in-between instance sizing, that's when the answer is a big fat "it depends".