The Nyquist theorem (the "half the sampling frequency" rule) does assume a perfectly band-limited signal (a theoretical brick-wall filter, which cannot be built in practice), and there is a slight "grey" area at exactly half the sampling frequency (that would be 22.05 kHz when sampling at 44.1 kHz). This is why the usable frequency response of a sampled system is always a little less than half the sampling frequency: it leaves room for a real-world filter.
You must not allow anything over half the sampling frequency to be sampled, because it will alias, and you cannot sort out aliasing on playback: the offending frequencies have already been "folded back" and are then part of the sub-20 kHz signal.
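To put a number on the "fold back", here is a minimal sketch for a single pure tone. The function name `alias_frequency` is made up for illustration, and it assumes an ideal sampler with no filter in front of it.

```python
def alias_frequency(f_hz: float, fs_hz: float) -> float:
    """Frequency at which a pure tone of f_hz shows up after sampling at fs_hz."""
    # A sampler cannot tell f apart from f + k*fs, so reduce modulo fs first.
    f_mod = f_hz % fs_hz
    # Anything in the upper half of that range mirrors back below fs/2.
    return min(f_mod, fs_hz - f_mod)


if __name__ == "__main__":
    fs = 44_100
    for tone in (10_000, 25_000, 30_000):
        print(f"{tone} Hz sampled at {fs} Hz appears at {alias_frequency(tone, fs)} Hz")
```

A 25 kHz tone sampled at 44.1 kHz lands at 19.1 kHz, inside the audible band, and once it is there nothing on the playback side can tell it apart from a genuine 19.1 kHz signal.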
It is how good this filter is that determines the quality of an A/D conversion. If the filter is weak it may allow aliasing to happen; if it is overzealous, it will start to affect audible frequencies that it should not be touching. A good quality filter design will fully block everything from the critical 22.05 kHz upwards, yet have no effect at 20 kHz. But some converters will struggle to do this. If you shift the sampling to 96 kHz, you can put a very lazy (gentle) filter in there that easily kills any aliasing and does not affect sub-20 kHz frequencies. So it's a lot easier to make a clean 96 kHz converter than a 44.1 kHz one, which is why some audio interfaces "seem" to sound better at 96 kHz.
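As a rough way of seeing how much "lazier" the filter can be, the sketch below compares the room available between 20 kHz and half the sampling frequency at the two rates. The 96 dB attenuation target is an assumption picked for illustration (roughly the dynamic range of 16-bit audio), not a figure from the discussion above, and treating the filter as a simple dB-per-octave slope is a simplification.

```python
import math

PASSBAND_EDGE_HZ = 20_000   # top of the band that must stay untouched
ATTENUATION_DB = 96.0       # assumed target attenuation by fs/2 (illustrative only)


def transition_octaves(fs_hz: float, passband_hz: float = PASSBAND_EDGE_HZ) -> float:
    """Width, in octaves, of the gap between the passband edge and fs/2."""
    return math.log2((fs_hz / 2) / passband_hz)


def required_slope_db_per_octave(fs_hz: float,
                                 attenuation_db: float = ATTENUATION_DB) -> float:
    """Average roll-off needed to reach the target attenuation by fs/2."""
    return attenuation_db / transition_octaves(fs_hz)


if __name__ == "__main__":
    for fs in (44_100, 96_000):
        print(f"fs = {fs} Hz: "
              f"{transition_octaves(fs):.2f} octaves of room, "
              f"{required_slope_db_per_octave(fs):.0f} dB/octave needed")
```

At 44.1 kHz the filter has only about 0.14 of an octave to work in, against roughly 1.26 octaves at 96 kHz, so under these assumptions the 44.1 kHz filter has to be around nine times steeper.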
It is, however, better to buy a top quality audio interface and run at 44.1 kHz (or 48 kHz if your output medium requires it) than to invest in a super fast computer that has to deal with 96 kHz sampling. The end result will be the same, especially as the final product is unlikely to be at 96 kHz anyway.