ok, let me try to explain 😊
I have to admit that I haven't used VSL yet, I only have experience with Synthogy's Ivory, but I guess the way it basically works is the same.
My understanding of how it currently works is as follows: When the engine initially starts, it reads the first part of each sample (of each sampleset the user has selected) into RAM, let's say the first 10ms. This amounts to quite a few GB. Additionally, the engine allocates one buffer for each voice of polyphony. These buffers can hold a much longer period, let's say 100ms. These buffers are cheap, because you need only very few of them compared to the number of buffers you need for the first 10ms of each sample.
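To put rough numbers on "quite a few GB" versus "cheap", here is a back-of-the-envelope sketch. The audio format (44.1 kHz, stereo, 24 bit), the one million samples and the 256 voices are all assumptions I made up for illustration, not figures from Ivory or VSL.

```python
def buffer_bytes(duration_ms, sample_rate=44_100, channels=2, bytes_per_sample=3):
    """Size in bytes of a buffer holding duration_ms of audio (assumed format)."""
    frames = sample_rate * duration_ms // 1000
    return frames * channels * bytes_per_sample


def preload_bytes(num_samples, head_ms=10):
    """RAM needed to keep the first head_ms of every sample resident."""
    return num_samples * buffer_bytes(head_ms)


def voice_pool_bytes(num_voices, stream_ms=100):
    """RAM needed for one streaming buffer per voice of polyphony."""
    return num_voices * buffer_bytes(stream_ms)


NUM_SAMPLES = 1_000_000  # hypothetical number of installed samples
NUM_VOICES = 256         # hypothetical maximum polyphony

print(f"preloaded heads: {preload_bytes(NUM_SAMPLES) / 1e9:.2f} GB")
print(f"voice buffers:   {voice_pool_bytes(NUM_VOICES) / 1e6:.2f} MB")
```

With these made-up numbers the preloaded heads need a few GB while the streaming buffers need only a few MB, which is why the per-voice buffers can afford to be ten times longer.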
The moment a (MIDI) event arrives, the engine can start playing the sample right away, because it has 10ms buffered in RAM. It starts playing from this buffer. Simultaneously it allocates one of the 100ms buffers and directs the HDD driver to fill it with the sample data from 10ms-110ms. After the first 10ms have been played, hopefully enough data has arrived from HDD to continue from the larger buffer. This buffer will constantly be refilled from HDD until the sample ends or playback is terminated. Afterwards the 100ms buffer will be released. If the sample is played again, the data starting from 10ms will have to be fetched from HDD again. The 10ms is chosen so that the HDD has enough time to respond.
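The flow described above could be sketched roughly like this. It is only a toy under my own assumptions: the "disk" is an in-memory io.BytesIO, the reads are synchronous (a real engine would issue them asynchronously while the head is playing), and the names preload_head and play are invented for this post.

```python
import io


def preload_head(disk, head_bytes):
    """At engine startup: copy the first part of the sample into RAM."""
    disk.seek(0)
    return disk.read(head_bytes)


def play(head, disk, stream_bytes):
    """Yield audio chunks: the RAM head first, then refills from disk."""
    # The head is already in RAM, so playback can start immediately.
    yield head
    # Streaming continues on disk exactly where the preloaded head ends.
    disk.seek(len(head))
    while True:
        chunk = disk.read(stream_bytes)  # refill the per-voice buffer
        if not chunk:
            break  # sample ended; the buffer can be released
        yield chunk


# Fake sample data standing in for a file on the HDD.
sample = bytes(range(256)) * 10
disk = io.BytesIO(sample)

head = preload_head(disk, head_bytes=100)
chunks = list(play(head, disk, stream_bytes=1000))
```

Joining the chunks gives back the original sample, and only the first chunk ever lived permanently in RAM.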
Now the same in my 3 layer model:
At installation time, we copy the first 10ms of each installed sample to SSD. This data will stay there until the sampleset is uninstalled.
At engine startup, the engine reads the first 1ms of each sample into RAM. Additionally it allocates the 100ms buffers as above.
The moment a (MIDI) event arrives, the engine can start playing the sample right away, because it has 1ms buffered in RAM. It starts playing from this buffer. Simultaneously it allocates one of the 100ms buffers and directs the SSD driver to fill the first 9ms of the buffer with the sample data from 1ms to 10ms. It also directs the HDD driver to fill the rest of the buffer with the sample data from 10ms-101ms. After the first 1ms has been played, hopefully enough data has arrived from SSD to continue from the larger buffer. After 10ms the data from HDD will have arrived so that there will be no disruption in playback. After that, this buffer will constantly be refilled from HDD in the same way as above until the sample ends or playback is terminated. Afterwards the 100ms buffer will be released. If the sample is played again, the data will have to be fetched from SSD and HDD again. The 1ms is chosen so that the SSD has enough time to respond, and the 1+9=10ms are chosen so that the HDD has enough time to respond.
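The timing argument behind both models boils down to one check: every layer's data has to arrive before playback reaches the point where that layer takes over. A small sketch of that check follows. The latency figures (0.2ms for SSD, 8ms and 15ms for HDD) are guesses purely for illustration, and first_underrun is a name I made up.

```python
def first_underrun(tiers):
    """Return the name of the first tier that would cause a dropout, or None.

    tiers: list of (name, start_ms, latency_ms). All read requests are
    assumed to be issued at note-on (t = 0). start_ms is the playback
    position at which this tier's data is first needed, latency_ms is
    how long the device takes to deliver it.
    """
    for name, start_ms, latency_ms in sorted(tiers, key=lambda t: t[1]):
        if latency_ms > start_ms:
            return name  # data arrives after playback already needs it
    return None


# Two-layer model: 10ms in RAM, the rest from HDD (hypothetical latency).
two_layer = [("RAM", 0, 0.0), ("HDD", 10, 8.0)]

# Three-layer model: 1ms in RAM, 1ms-10ms from SSD, the rest from HDD.
three_layer = [("RAM", 0, 0.0), ("SSD", 1, 0.2), ("HDD", 10, 8.0)]

# Same layout, but with an HDD that is too slow for the 10ms budget.
slow_hdd = [("RAM", 0, 0.0), ("SSD", 1, 0.2), ("HDD", 10, 15.0)]

for label, tiers in [("two-layer", two_layer),
                     ("three-layer", three_layer),
                     ("slow HDD", slow_hdd)]:
    print(label, "->", first_underrun(tiers) or "no dropout")
```

If the guessed latencies are anywhere near reality, the three-layer model has the same HDD budget as the two-layer one, while only a tenth of the head data has to stay in RAM.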
The model described here is based on how I would implement sample streaming on first impulse, because I have never done it. Please correct me if some (or all 😉) of my basic assumptions are wrong.
pps: it doesn't matter if it is PPC, intel, sparc, alpha, windows, OS X, solaris, irix, BSD, linux, whatever .... sample streaming has its rules everywhere ...
Sample streaming has, but latency has not. Clearly there are well-behaved systems and systems that are not. In my experience Windows is a nightmare regarding latency.