Vienna Symphonic Library Forum

  • the secret, besides the algorithm itself, is the dictionary matching used on the particular data (see my post above).

    it should work with a similar efficiency on harps and acoustic guitar - it will never work with brass.

    dig out your old exs install DVDs and see how well some instruments compress whereas others do not ...

    christian


    and remember: only a CRAY can run an endless loop in just three seconds.
  • last edited

    @julian said:

    I've never known an audio file zip save more than about 10% (90% of the original size) except when there has been silence in the file, whereas lossy compression can be extremely effective but with the downside of compromising the original quality.

    Suppose you had 100 different audio files, each one being a different pianist playing the same piano piece. Each audio file will be unique, because no two pianists would play the same piece in precisely the same way, and each individual audio file would probably compress to a zip of about 90% of the original size.

    However, suppose you took all 100 audio files and put them in a single zip, or concatenated all the audio files into one large audio file of all 100 performances and zipped that. My suspicion is that, because of the similarities between the audio files, you would achieve better compression (maybe worth an empirical experiment - see the sketch below).

    Matthew
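
    A minimal sketch of the experiment Matthew describes, assuming a folder of WAV files (the directory and file names are made up for illustration). It zips each performance on its own, then zips all of them into one archive, and prints the two compression ratios:

        # Hypothetical sketch: compare zipping audio files individually
        # against zipping them together in one archive.
        import glob
        import os
        import zipfile

        files = sorted(glob.glob("performances/*.wav"))
        total_raw = sum(os.path.getsize(f) for f in files)

        # 1) Zip each file on its own and add up the compressed sizes.
        individual_total = 0
        for f in files:
            with zipfile.ZipFile(f + ".zip", "w", zipfile.ZIP_DEFLATED) as z:
                z.write(f)
            individual_total += os.path.getsize(f + ".zip")

        # 2) Put all files into one archive so the compressor could, in
        #    principle, reuse matches across performances.
        with zipfile.ZipFile("all_performances.zip", "w", zipfile.ZIP_DEFLATED) as z:
            for f in files:
                z.write(f)
        combined_total = os.path.getsize("all_performances.zip")

        print(f"raw:        {total_raw} bytes")
        print(f"individual: {individual_total / total_raw:.1%} of original")
        print(f"combined:   {combined_total / total_raw:.1%} of original")

    One caveat: deflate only looks back about 32 KB, so a standard zip cannot actually exploit similarity between whole multi-megabyte files; an archiver with a much larger window would be a fairer test of the idea.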

  • last edited

    @julian said:

    If you interpolate data between samples it no longer is a true representation of the original sample. Also if you were to make two consecutive recordings of a piano string being hit by a hammer using the same midi velocity then tried to phase cancel these 100% it just would not happen as a piano is an analogue instrument not a digital waveform.
    I wasn't suggesting interpolating data.

    However, taking your cancellation example: I am willing to believe that the CEUS is sophisticated and accurate enough that if you programmed it to play the same note in the same way in the same acoustic environment and then tried to phase cancel the results, you could achieve in the region of 90% cancellation.

    I am also willing to believe that if you got the CEUS to play the same note in the same way but just one velocity layer apart (given that we are talking about 100 velocity layers overall), and then attempted to phase cancel those waves, you could achieve close to 90% cancellation (a rough way to measure this is sketched below).

    And it is basically because of those sorts of properties that this sample set lends itself to a high compression ratio, regardless of the compression algorithm used (although I'm happy to believe that VSL's algorithm is optimised for this sort of data).

    A good benchmark, if VSL is willing to try it, would be for someone at VSL to take the full 500GB of uncompressed data, put it in a big zip file, and then let us know what compression that achieves. It may not be as good as the 10:1 ratio (and a bit embarrassing for VSL if it turns out better!!), but my suspicion is that it will not be too far off (8:1 - 9:1).

    Matthew
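
    A minimal sketch of the cancellation measurement described above, assuming two 16-bit mono WAV recordings of the same note at the same sample rate with roughly aligned starts (the file names are hypothetical):

        # Subtract two takes of the same note and report how much of the
        # amplitude cancels. Assumes 16-bit mono WAVs; names are made up.
        import wave
        import numpy as np

        def load_mono(path):
            with wave.open(path, "rb") as w:
                frames = w.readframes(w.getnframes())
            return np.frombuffer(frames, dtype=np.int16).astype(np.float64)

        a = load_mono("note_take1.wav")
        b = load_mono("note_take2.wav")
        n = min(len(a), len(b))
        a, b = a[:n], b[:n]

        residual = a - b
        cancellation = 1.0 - np.sqrt(np.mean(residual ** 2)) / np.sqrt(np.mean(a ** 2))
        print(f"cancellation (RMS): {cancellation:.1%}")

    In practice two real takes would drift out of alignment over the course of the note, so the figure says more about the attack than about the whole sample.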

  • last edited

    @mdovey said:

    @julian said:

    I've never known an audio file zip save more than about 10% (90% of the original size) except when there has been silence in the file, whereas lossy compression can be extremely effective but with the downside of compromising the original quality.

    Computers do not have a subjective button. There would be absolutely no file-size saving from zipping 100 performances of the same piece by different pianists compared with 100 totally different pieces! CM will, I'm sure, confirm this!

    Here's an analogy: You want to build a house from handmade bricks. It takes 10,000 bricks to build the house but you don't want to transport 10,000 bricks so you transport 1000 bricks and make 9,000 machine reconstructed bricks based on the 1000 originals. The house created from the machine bricks may look the same (for 99% of viewers) as the house built completely from totally handmade bricks BUT for the purist the machine reconstructed brick house is not original handmade materials.

    Now what's the Vienna Imperial compared with the original instrument?!

    Julian


  • Good Lord, this thread is nuts.  Very much looking forward to more demos.


  • last edited

    @mdovey said:

    @julian said:

    If you interpolate data between samples it no longer is a true representation of the original sample. Also if you were to make two consecutive recordings of a piano string being hit by a hammer using the same midi velocity then tried to phase cancel these 100% it just would not happen as a piano is an analogue instrument not a digital waveform.
    I wasn't suggesting interpolating data.

    You will not get cancellation, as there are too many external variables (the reflections of the room, the felt on the hammers, the ribbons in the microphone, even, if you get down to minute levels, the temperature of the string when it is struck a second time). All of these do not conform to MIDI 1-127 level differences; they have almost infinite variations in reaction.

    So you will not get cancellation, and so they are not the same - it's either the same or different. There wouldn't be such a thing as 90% shared wavelengths at bit level. If you want to interpolate then yes, algorithms will achieve this to a greater or lesser extent, but it is not the same recording - it is a reconstruction.

    Julian


  • just one thing to say:

    playing acoustic music on loudspeakers always means re-construction: soundwaves to electrical current in the mic, plus maybe digitizing and decoding back to analogue, then reconstructing the soundwaves by moving the loudspeaker membrane.

    that's the nature of it. you never listen to the original, but to an electro-acoustic (plus maybe digital) reconstruction of the original.

    if that reconstruction sounds good, it's good. if vsl data compression sounds good, it's good.

    but here is the solution: if vienna imperial does not sound "original" enough, we just have to buy a boesendorfer ceus imperial.

    i tested it, it's really really marvellous. 130k and you're done.


  • last edited

    @clemenshaas said:

    130k and you're done.

    Jeez... it was a mere 15k last night, Clemens. For 130 I'd be hoping for Julian's 10,000-brick apartment to be part of the deal. [:D]

    Colin


  • last edited

    @julian said:

    Computers do not have a subjective button. There would be absolutely no file-size saving from zipping 100 performances of the same piece by different pianists compared with 100 totally different pieces! CM will, I'm sure, confirm this!

     

    Here's an analogy: You want to build a house from handmade bricks. It takes 10,000 bricks to build the house but you don't want to transport 10,000 bricks so you transport 1000 bricks and make 9,000 machine reconstructed bricks based on the 1000 originals. The house created from the machine bricks may look the same (for 99% of viewers) as the house built completely from totally handmade bricks BUT for the purist the machine reconstructed brick house is not original handmade materials.

     

    not everything that is flawed is a comparison ... christian


    and remember: only a CRAY can run an endless loop in just three seconds.
  • last edited

     

    @mdovey said:

    A good benchmark if VSL is willing to try it, would be for someone in VSL to take the full 500GB uncompressed data and put it in a big zip file, and then let us know what compression that achieves. It may not be as good as the 10:1 ratio (and a bit embarrasing for VSL it it turns out better!!), but my suspicion is that it will not be too far off (8:1 - 9:1).
    hehe - nice idea, but sorry, we need our computers to do some work 😉 have you ever tried to compress 500 GB?

    as an average value - C4, vel 64, 1s: zip 92%, rar 71% ... not too embarrassing 😉 christian


    and remember: only a CRAY can run an endless loop in just three seconds.
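
    For anyone who wants to repeat that single-sample test on their own files, here is a rough sketch using compressors from the Python standard library (zlib approximates zip's deflate; lzma is just a stronger general-purpose stand-in, not the algorithm rar uses; the file name is hypothetical):

        # Compress one sample file with two general-purpose codecs and
        # report the compressed size as a percentage of the original.
        import lzma
        import zlib

        with open("C4_vel064.wav", "rb") as f:
            data = f.read()

        for name, packed in [
            ("zlib (deflate)", zlib.compress(data, level=9)),
            ("lzma", lzma.compress(data, preset=9)),
        ]:
            print(f"{name}: {len(packed) / len(data):.1%} of original size")
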
  • last edited

    @Football said:

    Good Lord, this thread is nuts.  Very much looking forward to more demos.

    Nuts are of little use without a thread, therefore a thread is of little use without a nut.

    Discuss


  • Thanks Christian. Compression algorithms are tricky beasts to evaluate even with insight into the algorithm itself. It is even harder when generic algorithms and their efficiency at compressing hypothetical data are compared. It looks like the samples are quite easily compressed if zip (which obviously wasn't created to compress piano samples) was this effective.

    Aldo

  • Sorry to be a butt here and point this out, but if a piano has 88 keys and there are 1,200 samples per key, then why are there only 69,633 samples? 88 keys x 1,200 samples per key = 105,600 samples. I apologize if this question has already been asked and answered; I did a search for it and didn't find anything.

    By the way, Jay's demo sounds absolutely incredible, by far the most fantastic sampled piano in the universe....that is until VSL creates the next one! [:)]


  • I have to agree with you, Julian, in the sense that if you simply overlaid two samples, even if the velocity was identical and the phases were aligned as closely as possible at the beginning of the sample, by the end the two would be out of phase. However, mdovey's approach would need little modification to do a pretty good job. Given Christian's test zipping one of the files, I doubt anything this cumbersome is needed, but here's the idea.

    Most of the energy of the string is at some fundamental frequency. Since a good number of the keys on a piano have three strings, technically you'd have to determine three close, but not identical, frequencies for these and their phase alignment. For a given key at a given velocity, the amplitude is probably easily described as a function of time. Applying this envelope to the phase-aligned fundamental-frequency mashup thingy and subtracting it from the sample would probably remove most of the energy of the sample. All of the data that was removed could easily be represented in a handful of parameters that would take only a few bits and could be stored as metadata. The residual sample data could then be expressed with far fewer bits (a rough sketch of this idea follows below).

    As big a pain in the butt as this would be, all the heavy lifting is on the encoding side. The decoding is much simpler. A lot more refinement would be needed to achieve a 10:1 compression ratio. Neat idea, but why waste the time developing some crazy algorithm from scratch when you can just tailor something like zip and get the same result?

    I think Football is right, this thread is nuts. Probably need a new thread just to debate the compression algorithm.
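
    To make the residual idea above concrete, here is a small self-contained sketch on a synthetic decaying tone (no real sample data to hand, and this is only an illustration of the concept, not VSL's algorithm). It estimates the dominant partial, subtracts a frame-by-frame fitted sinusoid, and reports how much energy is left in the residual:

        import numpy as np

        sr = 44100
        t = np.arange(sr * 2) / sr  # two seconds
        # Synthetic stand-in for a piano sample: a decaying fundamental,
        # a weaker second partial and a little noise.
        signal = (np.exp(-1.5 * t) * np.sin(2 * np.pi * 261.6 * t)
                  + 0.2 * np.exp(-2.0 * t) * np.sin(2 * np.pi * 523.2 * t)
                  + 0.01 * np.random.randn(len(t)))

        # Estimate the strongest frequency from the spectrum.
        spectrum = np.abs(np.fft.rfft(signal))
        f0 = np.fft.rfftfreq(len(signal), 1.0 / sr)[np.argmax(spectrum)]

        # Frame-by-frame least-squares fit of a*sin + b*cos at f0 tracks the
        # decaying amplitude and phase, i.e. the "envelope" described above.
        residual = signal.copy()
        frame = 2048
        for start in range(0, len(signal) - frame + 1, frame):
            seg = slice(start, start + frame)
            basis = np.column_stack((np.sin(2 * np.pi * f0 * t[seg]),
                                     np.cos(2 * np.pi * f0 * t[seg])))
            coeffs, *_ = np.linalg.lstsq(basis, residual[seg], rcond=None)
            residual[seg] -= basis @ coeffs

        print("residual energy:",
              f"{np.sum(residual ** 2) / np.sum(signal ** 2):.1%} of the original")

    The smaller the residual, the fewer bits it needs, which is the whole point of the proposal; a real coder would of course have to handle all three strings, the attack transient and the room ambience, which is where most of the difficulty lies.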

  • last edited

    @Aldo Esplay said:

    I have to agree with you Julian in the sense that if you simply overlaid two samples, even if the velocity was identical and phases were aligned as much as possible at the beginning of the sample, that by the end the two would be out of phase. However, mdovey's approach would need little modification to do a pretty good job. With Christian's test with zipping one of the files, I doubt anything this cumbersome is needed, but here's the idea. Most of the energy of the string is at some fundamental frequency. Since a good number of the keys in the piano have three strings, technically you'd have to determine 3 close, but not identical frequencies for these and their phase alignment. For a given key at a given velocity, the amplitude is probably easily described as a function of time. Applying this envelope to the phase aligned fundamental frequency mashup thingy and subtracting it from the sample would probably remove most of the energy of the sample. All of that data that was removed could be easily represented in a handful of parameters that would take only a few bits and could be stored as metadata. The residual sample data could then be expressed with far fewer bits. As big a pain in the butt as this would be, all the heavy lifting is on the encoding side. The decoding is much simpler. A lot more refinement would be needed to achieve a 10:1 compression ratio. Neat idea, but why waste the time developing from scratch some crazy algorithm when you can just tailor something like zip and get the same result. I think Football is right, this thread is nuts. Probably need a new thread just to debate the compression algorithm.

    I may be misinterpreting this, but weren't Christian's results a saving of just 8% with the zip and 29% with the rar? That is lossless, in that the complete file is re-created on decoding (un-zipping). So the question is: is the VSL system actually a zip-like function where the exact file is reconstructed on decoding (unlikely, I would guess, at 10:1), or in fact data compression where 90% of the original data is removed forever? And if that is the case, how much is it changing the original sound quality? I would have thought most good musicians/engineers would detect 10:1 compression ratios in an A/B comparison, however good the algorithms used - particularly when there is low-level ambience involved (the room mic recordings).

    Julian


  • last edited

    @Aldo Esplay said:

    I have to agree with you Julian in the sense that if you simply overlaid two samples, even if the velocity was identical and phases were aligned as much as possible at the beginning of the sample, that by the end the two would be out of phase. However, mdovey's approach would need little modification to do a pretty good job. With Christian's test with zipping one of the files, I doubt anything this cumbersome is needed, but here's the idea. Most of the energy of the string is at some fundamental frequency. Since a good number of the keys in the piano have three strings, technically you'd have to determine 3 close, but not identical frequencies for these and their phase alignment. For a given key at a given velocity, the amplitude is probably easily described as a function of time. Applying this envelope to the phase aligned fundamental frequency mashup thingy and subtracting it from the sample would probably remove most of the energy of the sample. All of that data that was removed could be easily represented in a handful of parameters that would take only a few bits and could be stored as metadata. The residual sample data could then be expressed with far fewer bits. As big a pain in the butt as this would be, all the heavy lifting is on the encoding side. The decoding is much simpler. A lot more refinement would be needed to achieve a 10:1 compression ratio. Neat idea, but why waste the time developing from scratch some crazy algorithm when you can just tailor something like zip and get the same result. I think Football is right, this thread is nuts. Probably need a new thread just to debate the compression algorithm.

    I may be misinterpreting this, but weren't Christian's results a saving of just 8% with the zip and 29% with the rar? That is lossless, in that the complete file is re-created on decoding (un-zipping). So the question is: is the VSL system actually a zip-like function where the exact file is reconstructed on decoding (unlikely, I would guess, at 10:1), or in fact data compression where 90% of the original data is removed forever? And if that is the case, how much is it changing the original sound quality? I would have thought most good musicians/engineers would detect 10:1 compression ratios in an A/B comparison, however good the algorithms used - particularly when there is low-level ambience involved (the room mic recordings).

    Julian

    Hmm. That'd change everything. Zip primarily uses "deflate", which makes use of LZ77, which is a dictionary coder. Christian hinted that VSL is using a dictionary-matching coder, so I'd guess it's a lot like zip. I'm guessing that either the sliding window is larger to increase the chances of a match, or that a static dictionary is used instead to reduce the dictionary size. With this being a fixed library and the compression being geared entirely to this specific library, I'd guess the latter.

    Word size makes a big difference. At 24 bits, if you can represent 10 consecutive samples with a 24-bit dictionary reference, then you could achieve 10:1 for that one string. With the entire library available, an extremely optimized static dictionary could be built. Although it takes forever to build, as long as the entire dictionary can be read into memory (like 2 MB), the CPU can assemble the actual sample file as it is read from disk. If this is the case, then it would be lossless (a toy illustration of the static-dictionary idea follows below).

    Hopefully VSL has been able to optimize the dictionary to do it in a lossless manner. I'm sure they wouldn't go through all of the trouble to sample this much data just to screw it up with compression. More likely they had a goal to compress as much as possible without losing data, and the magic ratio came out to around 10:1.

    Aldo
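
    A toy sketch of that static-dictionary idea, using small integers instead of 24-bit audio words (block size, dictionary size and data are all made up; this is an illustration of the concept, not VSL's scheme). It builds a dictionary of the most common fixed-length blocks across a corpus, encodes data as reference tokens where a block matches and literals otherwise, and checks that the round trip is lossless:

        from collections import Counter

        BLOCK = 8  # samples per dictionary entry

        def build_dictionary(corpus, size=256):
            counts = Counter()
            for samples in corpus:
                for i in range(0, len(samples) - BLOCK + 1, BLOCK):
                    counts[tuple(samples[i:i + BLOCK])] += 1
            return [blk for blk, _ in counts.most_common(size)]

        def encode(samples, dictionary):
            index = {blk: i for i, blk in enumerate(dictionary)}
            tokens = []
            for i in range(0, len(samples) - BLOCK + 1, BLOCK):
                blk = tuple(samples[i:i + BLOCK])
                if blk in index:
                    tokens.append(("ref", index[blk]))  # one reference stands in for BLOCK samples
                else:
                    tokens.append(("lit", blk))         # fall back to raw samples
            return tokens

        def decode(tokens, dictionary):
            out = []
            for kind, value in tokens:
                out.extend(dictionary[value] if kind == "ref" else value)
            return out

        # Tiny demo with made-up data containing lots of repeated blocks.
        corpus = [[1, 2, 3, 4, 5, 6, 7, 8] * 50 + [9] * 8 for _ in range(4)]
        dictionary = build_dictionary(corpus)
        tokens = encode(corpus[0], dictionary)
        assert decode(tokens, dictionary) == corpus[0]  # lossless round trip
        refs = sum(1 for kind, _ in tokens if kind == "ref")
        print(f"{refs}/{len(tokens)} blocks replaced by dictionary references")

    How much this buys you depends entirely on how often real sample data repeats the dictionary blocks, which is exactly why the dictionary has to be tuned to this particular library.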

  • so finally you come to the same conclusion i've already posted earlier - most if not all is a question of the dictionary ;-)

    in fact it was a must - a piano needing 5 GB or more per mic position would have been limiting ...

    christian


    and remember: only a CRAY can run an endless loop in just three seconds.
  • last edited

    @Football said:

    Good Lord, this thread is nuts.  Very much looking forward to more demos.
     

    Here they are: two great pieces produced by Jay Bacal. This time you can hear the piano in a chamber music context (with bassoon or trumpet):

    http://vsl.co.at/en/67/702/704/414.htm

    Saint-Saëns, Bassoon Sonata, 3rd mov.

    Steven Halsey, Trumpet Sonata - Allegro

    best

    Herb


  • Ah, that's great.  Any closer perspective/more contemporary sounding demos coming soon?


  • Contemporary demos would be great.... also, I would love to hear something in more of a cinematic type of setting... the soft piano behind the orchestra. That's what I'm mostly looking for... - that and more of the close perspective.