Digital System Wars

More Evidence on Sony DSD/SACD

     In IAR's 1998 Master Guide, we discussed a serious (we think fatal) sonic flaw in the Sony-Philips DSD standard, also proposed as a standard for their Super Audio CD format. That discussion was based on the evidence of one demonstration, a well executed A-B-R comparison conducted by Sony themselves at AES.
     Since we published that article, we have had the opportunity to further evaluate DSD and SACD, in two further demonstrations, also conducted by Sony and Philips. All three demonstrations were very different in nature from each other, and on different kinds of systems. Thus, we now have three very different kinds of evaluations in our journalist's pouch as evidence.
     Because these three evaluations each differ in nature, they draw an observational bead on DSD's performance from three different angles. It's like triangulating on a target with three independent observations. That's very important, since a single experiment might always contain some unknown, peculiar fluke that makes its observations faulty. But if you make independent observations in three differently designed experiments, you are essentially looking at the same object from three different viewpoints. If all three independent viewpoints agree, you can be confident that the observed properties truly belong to the observed object itself, and are not merely a fluke of one vantage point or of one experiment's design.
     In this case, all three evaluations of DSD, in three different kinds of experiments, agreed and perfectly corroborated each other. They all revealed the same fatal sonic flaw. So the case against DSD and Super Audio CD is now far stronger than before.
     The second demonstration was conducted by Marantz (a high end division of Philips). This demo was based on CDs, rather than master tapes or computer hard discs. Thus its results are assuredly very relevant to what you could expect to hear from Super Audio CD in your home system. This demo was an instantaneous A-B comparison of exactly the same music, recorded onto two different CD formats, and played back from these CDs. The format pitted against Super Audio CD was not the true competition in today's world, the emerging CD standard from DVD-A, which allows 24/96 fidelity. Rather, this demo from Sony-Philips was showing off the alleged superiority of Super Audio CD over merely the ancient 16/44 CD standard. The Super Audio CD was played on a special CD player optimized for this new format, while the 16/44 CD of the same music was played through a standard Marantz CD player. Note that this put the 16/44 version under a bit of a handicap, since (as we all know) there are far better CD players that show 16/44 PCM CDs to better advantage than the Marantz. And, as for whether the SACD playback was optimal, one of Sony-Philips' chief selling points is that the playback circuitry is very simple and can be inexpensively optimized, as it presumably was in the special Marantz SACD player.
     So, how did the new SACD format compare to the handicapped and ancient 16/44 CD in this direct A-B comparison?
     In some sonic aspects, the SACD lost!! Above 8000 Hz the SACD sounded awful, especially on sibilants of the female singer, and on cymbal sounds from the drum kit. Whenever these musical notes came along, the ancient 16/44 PCM CD sounded much cleaner, faster, and more open (remember, both CDs came from the same original master). The SACD exhibited a very trashy distortion on these musical notes, making them frazzled and smeared.
     This gross distortion heard from the Super Audio CD version was identical to the sonic flaw we observed during Sony's earlier A-B-R demo using master tapes and studio processors, and occurred on the same types of musical notes. As we discussed in our 1998 Master Guide, this seems to be a slew related distortion, like a digital version of TIM.
     This second demo confirmed our findings from the first demo, and it's an especially powerful confirmation because the system setup was so different. Moreover, since this demo employed the finished CD product rather than master tapes and studio processor loops, the findings of this demo are assuredly relevant to what you will hear from Super Audio CDs in your home system.
     If the new Super Audio CD loses out even to the ancient 16/44 CD above 8000 Hz, you can well imagine that it will be slaughtered above 8000 Hz by 24/96 PCM CDs, including both the present ad hoc audiophile 24/96 standard on DVD video and the different forthcoming 24/96 DVD audio standard from DVD-A. And indeed we found this to be the case (see below).
     In all fairness, we must also report that, below 8000 Hz, DSD and Super Audio CD sounded wonderful in this CD A-B demo, just as we found in Sony's earlier demo. The Super Audio CD sounds more open, airy, musically natural, and dynamic than 16/44 PCM CD below 8000 Hz; in direct comparison, the 16/44 CD sounded more canned, glazed, constricted, and closed in.
     As we discussed previously, this means that the basic principles behind Super Audio CD are valid, but that the sampling rate is not nearly high enough to support the higher frequencies of the audio spectrum with decent fidelity. In a 1 bit system like DSD-SACD, a very high sampling rate is required in order to handle music to 20,000 Hz, and to handle steep, high slew rate musical notes such as vocal sibilants and cymbal sounds. The present DSD-SACD sampling rate is only good enough to cover music up to 8000 Hz. This is simply unacceptable as a high fidelity medium. It's like having a speaker system without any tweeter. Actually it's even worse than that, since a speaker system without a tweeter would merely sound dull, and would not actively distort treble information, while DSD-SACD does grossly distort music's trebles.
     Many listeners react favorably to the sound of DSD-SACD. They are obviously so entranced by the improved musical naturalness below 8000 Hz that they fail to notice the gross distortion above 8000 Hz on certain musical notes.
     The third demo was Sony's current professional road show, for studio engineers. This was a single ended demo, with no A-B comparisons. It's worth reporting on because it showed off DSD to its very best advantage. The playback system included Sony's own very revealing speakers, and the source was as good as it gets, a studio master hard disc. Thus, we were treated to the very best possible sound of DSD, coming directly off the master recorder.
     How did this sound? Again, up to 8000 Hz the sound was wonderful: open, airy, natural, and dynamic. But again there were severe sonic flaws above 8000 Hz, especially on musical notes requiring a high slew rate. One revealing track was an a cappella chorus. Every sibilant was grossly mangled.
     This mangling showed that DSD did a number of things wrong, which are worth a brief analysis. A live vocal sibilant is supposed to sound like clean, open white noise, like a jet of escaping steam. Try saying "ssssss" and listen to the sound. Notice that your teeth are bared, with your lips pulled back. Now say "moon", and then say just the "ooooo" part of "moon". Notice that your lips are cupped way forward, and are cupped into a circle. Next, say "ssssss" again, but this time force your lips into the same forward circular cup as they had while you were saying "ooooo". And finally, continue to say "ssssss" while moving your lips between this forward, cupped position and the pulled back teeth bared position. Notice that the sound of the "ssssss", your vocal sibilant, changes character drastically as you move your lips back and forth between these two positions. In the natural position, with lips pulled way back and teeth bared, your sibilant has a bright, open, white noise sound. This is what a live vocal sibilant sounds like, this is what an accurate recording should sound like, and this is what good PCM digital sounds like (both 16/44 and 24/96). In the artificial position, with your lips cupped forward, the pitch of the same "ssssss" sibilant drops, the sound is duller, the sound no longer has its natural spectral balance (the open, bright white noise sound of steam escaping), and the sound is closed in rather than open (as if it were trapped in a tunnel).
     This is what DSD did to the vocal sibilants of the chorus in this master recording. Whenever a vocal sibilant came along, the pitch apparently dropped lower, as if the singers had cupped their lips forward while singing every sibilant.
     DSD also mangled these sibilants in other ways. Try saying "ssssss" again (normally, with lips back and teeth bared). Notice that the natural sound consists of lots of little spikes of individuated noises. The only reason that you can hear these noise spikes as individuated, and subtly different from each other, is that there are instants of relative intertransient silence between the spikes. Now try saying "shoosh". Notice that the "sh" sound smears the spikes together into a more homogeneous sound, and that there are no longer individual spikes of noise with high peak amplitude.
     DSD does this same kind of mangling to sibilants. It reduces the amplitude of the individual peak spikes of noise, and smears the energy over time, filling in what should be intertransient silence between spikes. DSD might have excellent dynamics at lower frequencies, but in the trebles it sonically acts as a dynamic compressor, squashing the peaks. DSD then sonically takes this lost dynamic peak energy and smears it over time, filling in the spaces between transients so that the transient sounds lose their individuality, instead becoming blended and smeared into a homogeneous slur. DSD changes "ssiss" into "shoosh".
     This mangling of vocal sibilants was striking on the master recording of the a cappella chorus, because the recording was so superb at lower frequencies, and because there were no other instruments playing at the same time that might have masked this mangling. We heard this mangling, and another audio pro at this same demo also heard it, being bothered enough by it to speak up about it to others.
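     The kind of change described here, with peaks squashed and the gaps between them filled in, can be pictured with a toy numerical sketch. This is emphatically not a model of DSD or of any converter. It simply takes a hypothetical train of isolated spikes (our stand-in for a clean sibilant) and spreads each spike over its neighbours with a moving average, an assumed smearing chosen only for illustration. The names `smear` and `crest_factor` are ours.

```python
# Toy illustration only: NOT a model of DSD or of any real converter.
# It shows what "squashed peaks plus filled-in silence" looks like in numbers.

def smear(signal, width):
    """Centered moving average of odd `width`; beyond the ends counts as silence."""
    half = width // 2
    n = len(signal)
    out = []
    for i in range(n):
        window = signal[max(0, i - half): min(n, i + half + 1)]
        out.append(sum(window) / width)
    return out

def crest_factor(signal):
    """Peak-to-RMS ratio: high for spiky signals, low for smoothed ones."""
    peak = max(abs(x) for x in signal)
    rms = (sum(x * x for x in signal) / len(signal)) ** 0.5
    return peak / rms

# Hypothetical "sibilant": a few isolated spikes with silence in between.
spikes = [0.0] * 40
for position in (5, 13, 22, 30):
    spikes[position] = 1.0

smeared = smear(spikes, width=5)

silent_before = sum(1 for x in spikes if x == 0.0)
silent_after = sum(1 for x in smeared if x == 0.0)

print("peak before/after:        ", max(spikes), max(smeared))
print("silent samples before/after:", silent_before, silent_after)
print("crest factor before/after:", crest_factor(spikes), crest_factor(smeared))
```

     In this sketch the smeared version has lower peaks, fewer silent samples between events, and a lower peak-to-RMS ratio, which is the numerical counterpart of "ssiss" turning into "shoosh".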
     Why should DSD-SACD have a too-low sampling rate problem that leads to these fatal sonic flaws above 8000 Hz? After all, this is a studio mastering and archiving system, which is supposed to have data capability even beyond any consumer distribution medium. And this system is being born in the age of high density laser discs (such as DVD), with ample storage to support high sampling rates.
     DSD's too-low sampling rate is even more puzzling, and more shocking, when we look at a bit of audio history. Philips was one of the pioneers of noise shifting, i.e. time averaging of oversampling, a technique which allows fewer bits to do the work of more bits, at least for lower frequencies where there are enough samples to average. In their first application of this technique, Philips reduced the bit resolution only a slight amount, from 16 bits to 14 bits, and they offset this slight resolution loss by oversampling by 4 times, at 176 kHz instead of 44 kHz. This was an equitable tradeoff of information content, with 4 times less resolution traded for 4 times greater bandwidth (although not a perfect tradeoff, since the time averaging failed to offer genuine 16 bit resolution at music's highest frequencies).
     Then, some years later, Philips was trying to find a way to build really cheap CD players for budget consumer systems. They came up with a really cheap chip set by reducing the bit resolution from 16 bits all the way down to 1 bit, and they called it Bitstream. With such a large reduction in bit resolution, the oversampling should have been increased to roughly 32,000 times, if they wanted to preserve an equitable tradeoff of information content (to preserve basic information content, the sampling rate should be doubled for every bit dropped from resolution, and dropping 15 bits means doubling 15 times, or 32,768 times). But Philips didn't do this. Instead, they increased the oversampling to only 256 times the nominal 44 kHz (thus providing 1 bit sampling at 11.3 MHz). Why such a compromise, of only 256 times oversampling instead of 32,000 times oversampling? Remember that this Bitstream system was intended only for the cheapest consumer CD players. It was not intended to even replace Philips' own more expensive multibit consumer CD players. And it was most certainly not intended to become a studio mastering and archiving system. Note that this was over 10 years ago, when the state of the digital art was far more primitive than it is today, and digital media did not have the large storage capability to support the high sampling rates that today's media do.
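     The doubling rule used in this argument (double the sampling rate for every bit dropped from resolution) is easy to work through numerically. The short sketch below does so, assuming the standard CD rate of 44,100 Hz for the "nominal 44 kHz"; the function name `equitable_oversampling` is ours, chosen for illustration. It applies only the rule as stated here and makes no claim about how any real converter is designed.

```python
CD_BITS = 16          # bit resolution of the 16/44 CD format
CD_RATE_HZ = 44_100   # assumed exact value of the "nominal 44 kHz"

def equitable_oversampling(target_bits, source_bits=CD_BITS):
    """Oversampling factor under the rule: double the rate per bit dropped."""
    return 2 ** (source_bits - target_bits)

# (description, bits kept, oversampling factor actually used, per the text)
cases = [
    ("Philips 14 bit", 14, 4),
    ("Philips Bitstream 1 bit", 1, 256),
]

for name, bits, actual in cases:
    required = equitable_oversampling(bits)
    print(f"{name}: rule asks for {required}x "
          f"({required * CD_RATE_HZ / 1e6:.3f} MHz), actual {actual}x, "
          f"shortfall {required // actual}x")
```

     Under this rule the 14 bit system comes out even (4 times required, 4 times used), while a 1 bit system would need 2 to the 15th power, or 32,768 times, which is the "32,000 times" figure rounded.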
     So, before we go forward, remember and keep this key fact in mind: over 10 years ago, when digital was primitive and storage media limited, Philips designed a compromised 1 bit system for only the cheapest consumer CD players, and they still gave it 256 times oversampling as a sampling rate.
     Now let's fast forward to the present. Now we have more sophisticated digital systems, and digital media with much higher storage capability and faster transfer rates, so we can engineer and we can afford higher sampling rates than we could 10 years ago. Now we see Philips and Sony collaborating on a new digital standard which is not intended as just a compromise for the cheapest consumer CD players, but also for the best consumer CD players, and also even for the holiest of holies, studio mastering and archiving of music for generations to come (which obviously merits the very best possible fidelity, without compromise).
     Naturally, from all these considerations, one would expect that this new standard would have a much higher sampling rate than the compromise system developed 10 years ago only for the cheapest consumer CD players. One would expect therefore that DSD-SACD (also a 1 bit system) would oversample at some rate much higher than the 256 times of that ancient Bitstream cheap consumer compromise.
     So, how much higher, how much better, than 256 times oversampling, is the oversampling that Sony and Philips have put into DSD-SACD, the modern new mastering standard for the ages? Is it perhaps 512 times oversampling, twice as good? Is it 1024 times oversampling, 4 times better?
     No.
     It's actually 64 times oversampling, which is 4 times worse!!! DSD-SACD, the modern new mastering standard for the ages, samples music at only 1/4 the sampling rate used 10 years ago by Philips' own Bitstream, intended only for the cheapest consumer CD players of those primitive ancient times. Bitstream's 1 bit system sampled at 11.3 MHz, but DSD-SACD samples at only 2.8 MHz.
     Remember that Bitstream's 256 times oversampling was already a compromise for cheapness. If Bitstream were to have preserved the same information content as the 16/44 multibit CD player, it would have to have been given an oversampling rate of 32,000 times.
     You'd think that any move toward mastering quality, and/or toward modern digital standards and capabilities, would require an oversampling move to a higher number that would at least equal this 32,000 times (which would make it the informational equivalent of 16/44 multibit). But Sony-Philips didn't make DSD better than Bitstream, or equivalent to 16/44 multibit. They didn't even make it equal to Bitstream. Instead, they made it worse than Bitstream. Four times worse! What a travesty!
     No wonder DSD-SACD has such problems mangling music's high frequencies! It's a giant step backwards in sampling rate, down to a sampling rate that is simply too low to accurately capture music's fastest waveforms with a 1 bit system.
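     For readers who want to check the sampling-rate figures quoted above, here is a minimal sketch of the arithmetic. It assumes the exact CD rate of 44,100 Hz, and it takes the "32,000 times" benchmark to be 2 to the 15th power (one doubling for each of the 15 bits dropped in going from 16 bits to 1 bit). The variable names are ours.

```python
CD_RATE_HZ = 44_100            # assumed exact value of the "nominal 44 kHz"
EQUITABLE_FACTOR = 2 ** 15     # 32,768: one doubling per bit dropped, 16 -> 1

# Oversampling factors as given in the text.
oversampling = {"Bitstream": 256, "DSD-SACD": 64}

# 1 bit sampling rate of each system, in Hz.
rates_hz = {name: factor * CD_RATE_HZ for name, factor in oversampling.items()}

# How many times faster Bitstream samples than DSD-SACD.
ratio = rates_hz["Bitstream"] / rates_hz["DSD-SACD"]

# How far each system falls short of the doubling-rule benchmark.
shortfall = {name: EQUITABLE_FACTOR // factor
             for name, factor in oversampling.items()}

for name in oversampling:
    print(f"{name}: {rates_hz[name] / 1e6:.4f} MHz, "
          f"{shortfall[name]}x below the doubling-rule benchmark")
print(f"Bitstream / DSD-SACD sampling rate ratio: {ratio}")
```

     The rates come out at about 11.3 MHz and 2.8 MHz, a ratio of 4, and by the doubling rule DSD-SACD sits a factor of 512 below the benchmark where Bitstream sat a factor of 128 below it.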

Update on DVD Video for Audio

     In our 1998 Master Guide we reported that early DVD video players sounded flawed playing CDs in general, including the ad hoc 24/96 audio discs. We found that DVD players sounded artificial and fatiguing, with mediocre transparency and resolution, throughout the spectrum. Furthermore, in music's trebles they also sounded sizzly, sandy, edgy, brittle, and frazzled. In sum, they sounded like a giant step backwards, to the likes of primitive early digital. So we wrote an IAR Consumer Alert warning to you in January 1998, which came off the presses as part of the 1998 Master Guide a few months later, in early 1999.
     We knew we were sticking our neck out. Other popular audio publications had been singing the praises of DVD players for audio, so much so that they virtually killed off the high end CD player industry. But this seemed like a gross injustice. Even moderately good CD players sounded so much better than DVD players on music (cf. our 1998 comparison of Resolution Audio's modest CD player against their custom DVD player), so certainly high end CD players are far better yet than DVD players. So we knew we had to speak out and warn you that we saw no clothes on the new emperor. Since we accept no ad revenue and have no axes to grind, we are free to call them as we hear them. But it still felt risky flying in the face of nearly universal published opinion on DVD for audio.
     You can imagine our relief when, months later, we opened the October 1998 issue of Hi-Fi News and saw Paul Miller's article, DVD The Truth, in which he came to the same conclusions about the dire sound of 6 audiophile DVD players, based on sonic evaluations by his listening panel of experts. His panel joined us in calling the sound of these DVD players a giant step backwards. Paul's panel also corroborated the findings of our comparison with a moderately priced CD player; they found that a moderately priced CD player, the Arcam Alpha 9, sounded so much better than all 6 DVD players that it was in another league.
     Independent corroboration always feels good to any scientist who has stuck his neck out. And it reassures you the reader that IAR's early warning to you was right. As they say, you read it here first.
     Now we have some more news for you about DVD's capability for music. And this time it's good news, or at least promising news.
     The first item is that we have finally heard some decent sounding music coming from a DVD video transport. The DVD player happened to be a Pioneer Elite, and it was being used only as a transport, feeding a high end outboard decoder, from ad hoc 24/96 discs. Finally we heard some musically credible trebles coming from ad hoc 24/96 discs.
     It's possible, though, that the good sound depended crucially on using the Pioneer DVD only as a transport, since this might have avoided the signal contamination that's inevitable nowadays from

(Continued on page 18)