averagers achieve exactly these benefits, and manage to do so without harming the integrity of the individualistic, singular, unique transients in which real music abounds.
These benefits from averaging can be realized for and applied to any PCM digital system, regardless of its intrinsic sampling rate and bit resolution. We call this concept hybrid PCM, and have discussed it in other articles in this IAR series. The term hybrid PCM refers to the fact that the concept borrows averaging technology from low bit resolution (including 1 bit) digital systems, and exploits it to benefit high bit resolution systems. Moreover, it introduces to PCM the interesting but alien concept of frequency shaped calculations and benefits. The basic premise of PCM is that it does not need any resolution enhancement help or any noise/error reduction help, since by design it maintains full resolution for each individual sample, and thus maintains this resolution equally for all frequencies in the spectrum of interest, even out to the most difficult highest frequencies of its spectrum. But the hybrid PCM concept teaches us that, even though PCM maintains full information out to its highest frequencies, there is actually extra information that is redundantly coded at its lower frequencies - and we might as well use this extra information to, via averaging, improve PCM's performance even further at progressively lower frequencies. Averaging can't help a good PCM system much at its highest frequencies, but it can make a good PCM system even better through the crucial midranges, and even more so at yet lower frequencies. People have often puzzled over why upsampling averagers seem to improve the sound of music's bass; well, that's exactly where averaging can make the biggest improvements in signal accuracy.
We've seen that averaging can make digital sound more musically real and natural, especially through the midranges where the ear is so sensitive, and at lower frequencies. If an engineer is designing an upsampling and averaging unit by ear, it's very tempting to get led astray and go overboard. Doing a properly limited amount of averaging (limiting the averaging span to one half cycle) can add a little musical naturalness that sounds gentler than digital's typical analytic hardness, and closer to analog. It's easy to jump to the conclusion that adding even more of this gentle musical naturalness must be even better, and would make the digital system sound even more like analog. Thus, it's very tempting to take the averaging span beyond the half cycle limit rule, thereby making the music even smoother and gentler.
There's no barrier to keep the overly enthusiastic engineer from designing in too broad an averaging span, and some PCM upsamplers on the market probably do violate this half cycle rule. But this is horribly wrong. It starts destroying the individualistic integrity of the original music signal, especially the singular transient information that is not part of a repeating pattern. It starts creating an output signal where some of the samples start being clones instead of unique individuals, and clones of a mere averaged lowest common denominator, as discussed above. As musical moments lose their individuality and become more like clones of an Orwellian average, the resulting emasculated music will sound even smoother and gentler and sweeter and softer. This aural pablum might subjectively please some ears, but it is a distortion of the true music, involving a serious loss of information, and should not be tolerated in any system pretending to accuracy.
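The flattening effect described above can be sketched numerically. This is a minimal illustration, not any actual upsampler's algorithm: the simple boxcar average and the window widths are our own illustrative assumptions.

```python
# Sketch (not any real product's algorithm): how widening an averaging
# span flattens a singular transient toward the local average.

def moving_average(signal, span):
    """Average each sample with its span-wide neighborhood (boxcar)."""
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - span // 2): i + span // 2 + 1]
        out.append(sum(window) / len(window))
    return out

# A lone transient: one loud sample in otherwise silent audio.
click = [0.0] * 100
click[50] = 1.0

for span in (3, 9, 33):
    peak = max(moving_average(click, span))
    print(f"span {span:3d}: transient peak reduced to {peak:.3f}")
```

As the span widens past the transient's own width, the lone peak collapses in proportion to 1/span toward the average of its silent surroundings, which is the "lowest common denominator" effect just described.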
You should be careful when evaluating and comparing various PCM upsampling averagers or resolution enhancers on the market (including those built into complete disc players). They will surely sound very different from each other, because there are so many parameters to play with in designing an averaging algorithm that literally reshapes the music. They will all surely also make a difference in your system, because they do change the music, so their output sounds different than their input. But making a sonic difference is not the same as making music better. They might in fact be making music worse, by losing information, especially individualistic transient information. They will give you altered music to be sure, and perhaps music that is easier on the ears, but it could also represent a degradation rather than an improvement in resolution. You should beware especially of those units that make music sound the smoothest and sweetest. They probably go beyond the legitimate limits of averaging, they are probably feeding you cloned pablum, and they are probably discarding some of the incisive articulation that individualistic singular transients give to music when they are correctly preserved from the input signal.
Which brings us to DSD/SACD. DSD/SACD is a digital system with just 1 bit of resolution, the lowest, crudest resolution possible. But it is an oversampling system, so some degree of averaging is legitimate, to quiet noise and enhance bit resolution. How much averaging is legitimate for DSD/SACD, before it starts destroying music's individualistic transient information? For musical information at 20 kHz, DSD/SACD oversamples by a factor of 64 times. So it is legitimate for DSD/SACD to gather a span of 64 samples to average, for reproducing music's information at 20 kHz. This averaging would genuinely enhance the bit resolution of DSD/SACD, taking it up to 6 bits.
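The article's figure of 6 bits from 64-times oversampling can be checked with a few lines. This is a sketch of the counting argument only (a sum of N one-bit samples can take only N + 1 values), not a model of DSD's actual noise-shaped modulator.

```python
# Counting argument behind the "64 samples -> 6 bits" figure above.
# Averaging n one-bit samples yields only n + 1 distinct output levels,
# i.e. roughly log2(n + 1) bits of resolution.
import math

def bits_from_averaging(n_samples):
    """Effective bits from averaging n one-bit samples."""
    levels = n_samples + 1  # possible sums run from 0 to n
    return math.log2(levels)

print(f"64-sample span -> {bits_from_averaging(64):.2f} bits")  # prints ~6.02
```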
But of course 6 bits of resolution is still very crude, distorted, and noisy -- not nearly enough to qualify for anything resembling high fidelity. It would be a long stretch to enhance 6 bits up to 16 bits, but even that would not impress the marketplace today, because CD already achieved this 16 bit benchmark 18 years ago. Thus, the designers of DSD/SACD felt obliged to aim for a claimed bit resolution spec of 20+ bits, so the system could be promoted in today's marketplace.
If you have a digital system that can only give you 6 bits of resolution when you limit your averaging to a legitimate span that preserves individualistic musical information, how can you enhance its apparent resolution all the way up to 20+ bits? The only way is to severely violate the legitimate averaging limits, and pursue far more aggressive averaging that massively destroys individualistic musical information from the input signal, and that creates a very distorted output signal consisting largely of clones of a lowest common denominator averaged version of the input signal samples. The only way for DSD/SACD to climb from 6 bits all the way to 20+ bits is to climb on the backs of clones.
In the PCM example just above, we discussed a sliding or slanted scale, wherein it was legitimate to gather in a progressively wider span of samples at progressively lower frequencies, to be included in an averaging algorithm still limited to half a cycle, thereby legitimately giving us progressively better noise quieting and resolution enhancement at progressively lower frequencies. The same thing holds true for DSD/SACD. At 20 kHz DSD/SACD can only legitimately enhance its resolution up to 6 bits by averaging, and when it goes beyond 6 bits it starts destroying individualistic musical information and substituting averaged clones. Then, at 10 kHz it can legitimately enhance its resolution up to 7 bits by averaging, at 5 kHz up to 8 bits, at 2500 Hz up to 9 bits, at 1250 Hz up to 10 bits, at 625 Hz up to 11 bits, at 312 Hz up to 12 bits, and so on. This means that DSD/SACD can do a credible job of reproducing music's bass and lower midrange frequencies, within the bounds of legitimate resolution enhancement via averaging that does not significantly destroy music's individualistic information. In other words, DSD/SACD does not need to do heavy cloning at lower frequencies in order to give us decent noise quieting there.
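The octave-by-octave scale above can be tabulated with the same counting argument. The baseline (a 64-sample, half-cycle span at 20 kHz) is taken from the article; everything else follows by doubling the span per octave down.

```python
# Sliding scale sketch: each halving of frequency doubles the legitimate
# half-cycle averaging span, adding roughly one bit of resolution.
# Baseline of 64 samples at 20 kHz follows the article's arithmetic.
import math

freq, span = 20000.0, 64
while freq >= 312:
    bits = math.log2(span + 1)  # n one-bit samples -> n + 1 levels
    print(f"{freq:7.0f} Hz: span {span:5d} samples -> ~{bits:4.1f} bits")
    freq /= 2
    span *= 2
```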
Remarkably, these technical factors fully explain the total sound package that DSD/SACD actually delivers. Which is strong corroborative evidence that these technical factors are really operative and relevant and aurally significant here.
In short, DSD/SACD sounds very good in the bass and lower midrange. And then at progressively higher frequencies it sounds progressively more smoothed down and generically cloned. By the time we get to music's upper midrange and trebles, most of the music we hear consists of repeated clones of a mere average, so it sounds very gentle, smooth, and sweet - almost like the repeating clone sine wave that it has nearly become.
Ironically, music's upper midrange and trebles are precisely the spectral regions where the most interesting and important unique, singular transients occur - individualistic, non-repeating transients that define the beginning of an attack, or that reveal the textural noises that make a musical instrument or voice sound real. Yet these are precisely the regions where DSD/SACD tragically does its worst erasure of the individuality of the input signal samples, and does its heaviest cloning in creating its very different output signal. The spectral regions where it's most important to preserve individualism are precisely the regions where DSD/SACD erases it worst. That's the most searing indictment we can imagine of a music recording/reproduction system: that it destroys music the worst in precisely those areas where it most needs to be accurate.
When DSD/SACD discards information about singular event transients from the input signal, it discards nearly all of real music's sharp, hard attack sounds (since attacks are of course singular events) - so naturally DSD/SACD makes music sound rounder and softer. When DSD/SACD outputs samples that are uniform clones instead of being strikingly individualistically varied, as the input samples were, naturally it makes music sound more uniformly gentle and smoothed down. And when DSD/SACD outputs samples that are clones of a lowest common denominator average, naturally it makes music sound inoffensive, like pleasant pablum or elevator music.
Note that DSD/SACD can't really avoid committing this faux pas. The laws of physics and information dictate that you can't legitimately get as much benefit from the averaging technique at higher frequencies as you can at low. They dictate that if you do push averaging beyond its legitimate limits you will be destroying input signal information and substituting clones in your output signal. The designers of DSD/SACD were trying for 20+ bits of resolution across the spectrum. But the laws of physics and information gave them the least benefit, the least amount of legitimate averaging at music's higher frequencies, indeed only enough to get up to 6 bits of resolution. So it is the laws of physics and information that forced the designers of DSD/SACD to violate the legitimate limits of averaging most flagrantly in music's higher frequencies, and to thereby do there the worst destruction of individualistic information and the worst creation of a cloned, lowest common denominator averaged output signal.
Of course, the reason that the designers of DSD/SACD even got into this pickle in the first place is that they decided to make their 1 bit system's sampling rate so low, merely one fourth of their previous cheap consumer level 1 bit system called Bitstream. This lower sampling rate set a lower ceiling on how much the averaging technique could be legitimately used to enhance resolution, before running into the limit which when violated would start introducing cloning. A higher sampling rate choice at the outset would have meant that less aggressive averaging would have sufficed to achieve the desired enhancement to 20+ bits, and therefore more of the input signal's true individualism could have been preserved and less of the output signal would be cloned.
Full Circle to HFN Article
And now we can come full circle, back to the HFN article. It's likely that the designers of DSD/SACD did not worry about any possible negative consequences of employing overly aggressive averaging. It's likely that they were supremely confident that they could get away with a lower sampling rate for this new master recording system, only ¼ that of the cheap consumer playback Bitstream system, because they thought that they could use the averaging technique with virtually unlimited aggressiveness and power, to enhance resolution as much as they wanted to, and thereby easily get from 1 bit all the way up to 20+ bits without any penalty. And the reason that they mistakenly thought this was that their thinking was running in the same vein as the HFN article (indeed, we would expect an engineer reading and believing this HFN article to promptly go out and design a tragically flawed system just like DSD/SACD).
You see, the HFN article speaks of unlimited resolution enhancement possibilities, which as we now know can indeed be achieved by aggressive averaging of many, many input samples gathered together. And the HFN article uses as its model and proof signal only a single sine wave. It's likely that the designers of DSD/SACD likewise only used single sine waves as a signal to develop and test their new digital system. Guess what? DSD/SACD works great when there's only a single sine wave as an input signal. DSD/SACD preserves and reproduces the input signal very well, and with 20+ bit resolution and low noise. The designers likely only thought in terms of single sine wave signals, and likely only tested DSD/SACD with single sine wave signals. So they must have felt very confident and enthusiastic about their new digital system. It was doing everything right, and it wasn't doing anything wrong, wrong, wrong, wrong, wrong….
We now know better. We now know that the only reason that DSD/SACD would give the illusion of performing well on a single sine wave is that a sine wave is a perfect clone of itself and of its own average, both as an input signal and as an output signal. Consider any system which discarded all individuality of input signal samples, which lost all this information and collapsed it all into a single average, and which then reconstituted the output signal as mere repeating clones of this lowest common denominator average. A system which committed all these crimes would never appear to be doing anything wrong with a single sine wave as a test signal. Again, that's because a sine wave signal, far from being a typical test probe valid for representing most other signals, is actually a very rare and unusually weird signal. A sine wave, unlike all music signals, is an endlessly repeating clone of itself and of its own average. Thus, a sine wave can never reveal if any system has cloning problems. But music, being ever changing and being chock full of singular transient events, is very sensitive to any cloning problems.
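The claim above, that a sine wave is blind to cloning while a transient exposes it, is easy to demonstrate. The heavy boxcar average and signal parameters below are illustrative assumptions, not DSD's actual filtering: the same filter passes a slow sine nearly intact but all but erases a lone click.

```python
# Why a sine wave hides over-averaging: a moving average leaves a
# low-frequency sine's shape nearly unchanged, but flattens a singular
# transient almost completely. Parameters are illustrative only.
import math

def moving_average(signal, span):
    """Average each sample with its span-wide neighborhood (boxcar)."""
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - span // 2): i + span // 2 + 1]
        out.append(sum(window) / len(window))
    return out

n, span = 1000, 21
sine = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]  # 5 slow cycles
click = [0.0] * n
click[n // 2] = 1.0  # one singular transient

sine_peak = max(moving_average(sine, span))
click_peak = max(moving_average(click, span))
print(f"sine peak after averaging:      {sine_peak:.3f}")   # still near 1.0
print(f"transient peak after averaging: {click_peak:.3f}")  # collapsed to ~0.05
```

A sine-wave test would report this filter as nearly transparent; the click reveals what it actually does to singular events.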
The designers of DSD/SACD likely never even realized that there is such a thing as a cloning problem, just as the HFN article doesn't. They likely never even realized that music is fundamentally and crucially diametrically opposite to a sine wave in reacting to a system's cloning problems, just as the HFN article doesn't. They likely never even realized that a sine wave was exactly the one wrong signal to use for testing their system for performance on music, given the likelihood and important relevance of cloning problems. They likely thought, just as the HFN article encourages others to think, that a sine wave was a sufficient model and probe to reveal and demonstrate a system's resolution enhancement capabilities with all other signals including real music. Engineers are universally trained to rely totally on a sine wave signal for designing and evaluating audio equipment, since it is a quick and easy tool to use. And a unit's sine wave performance is sometimes relevant to and indicative of its performance with a real music signal. But emphatically not here, in the context of averaging for resolution enhancement.
Furthermore, when the designers of DSD/SACD finally sat down to listen to the output of their new system on real music, they probably liked the gentle, smooth, sweet sound they heard. They probably never compared the sound of DSD/SACD's output directly to the input signal, to discover whether it was inaccurately more smoothed down and gentle than the music going into the system.
One might think that PCM must suffer these same problems as DSD/SACD. After all, the laws of physics and information, dictating the slanted scale of permissible averaging for different frequency regions, certainly apply to both digital systems.
Why then doesn't PCM face these same problems as DSD/SACD? The answer is that PCM heroically succeeds at the difficult task of achieving its full advertised resolution (be it 16 bit or 24 bit) for each and every sample, and therefore also at the highest musical frequencies within its passband. Because PCM already has achieved full bit resolution (say 24 bits) even at 20 kHz, it doesn't need any resolution enhancement help at all from averaging. So, if averaging has only small benefits to bestow at 20 kHz when limited to a proper, non-cloning span, that doesn't bother PCM in the least, since it didn't even need these benefits in the first place.
In sharp contrast, DSD/SACD has only 1 bit resolution, so it needs to rely on averaging for a huge amount of enhancement boost, enough to get all the way up to its advertised 20+ bits. Therefore, DSD/SACD is very much at the mercy of the limitations of averaging and of the penalties if it does too much averaging. Of this 20 bit enhancement boost that DSD/SACD needs from averaging, a paltry 6 bits at 20 kHz can come from a legitimate amount of averaging that won't destroy music, while a whopping 14 bits must come from violating averaging's limits and thereby incurring the penalties of the cloning problem.
PCM can still employ the benefits of averaging, as we discussed for hybrid PCM. But PCM doesn't need these benefits at all, whereas DSD/SACD crucially does need them, and in much too big a way. So enlightened hybrid PCM can employ beneficial averaging just to gild the lily, and make an already good system even better, without ever needing to go beyond the averaging limits that start imposing the cloning problem.
The biggest mistake the DSD/SACD designers made was setting themselves the goal of dramatic noise quieting and resolution enhancement, enough to achieve a marketing claim spec of 20+ bits, from a system with a much too low sampling rate. Dramatic noise quieting and resolution enhancement thus became their primary goal, justifying very aggressive averaging, or whatever means and whatever sacrifices might be necessary to achieve it. During development, they probably saw dramatic improvements using a sine wave model or sine wave performance, just like that shown in the HFN article. Somehow, they naively assumed that a system which worked so well with a sine wave would also work as well with real music. Somehow, the cloning problem never entered their thoughts. Somehow, they never realized that sine wave testing of an averaging system is utterly irrelevant to its performance with real music, because a sine wave (being already a self clone) simply cannot reveal how an averaging system is performing with respect to the cloning problem that can be so destructive of real music. Perhaps this analysis just might wake them up. And if not them, hopefully it has awakened you to the flaws of DSD/SACD.