well also be a further cause of that burst of white noise (discussed just above) which we hear accompanying and even being substituted for genuine upper treble musical information.
      The second problem is that this mountain of noise could be responsible for causing the ugly distortion artifacts we hear when the music has strong upper treble energy, as with vocal sibilants and cymbal hits. This ugly distortion might arise because the mountain of noise intermodulates in an ugly way with strong upper treble music energy when it comes along. Or this ugly distortion might arise because the strong upper treble music energy overloads the aggressive digital calculations of DSD/SACD's algorithm, either directly overloading it or perhaps stimulating the mountain of noise peaking at 53 kc into an overdrive condition. As you know, analog amplifiers can clip momentarily, chopping off the signal with ugly sounding distortion, if they are overdriven by a music signal that temporarily has too high an amplitude, or that temporarily has too much stressful high frequency energy. Also, as we all learned during the days of TIM discovery, an analog amplifier can temporarily clip its own input via its feedback loop, and this problem is exacerbated because the amplifier multiplies its input signal (via its gain) before sending that multiplied signal back to its own input via its feedback loop. Well, much the same thing can happen with the digital calculations of signal processing algorithms, especially with the more powerful higher order algorithms, like the aggressive fifth order algorithm used for DSD/SACD's pattern recognition and averaging noise quieting. These digital algorithms have recirculating loops that are like feedback loops, and these algorithms do extensive multiplications of the signal input numbers, with the multiplied products at the output being sent back to the input via the recirculation loop (just as with the feedback loop of an analog amplifier). 
If the algorithm is aggressive and complex, there are a lot of multiplications and a lot of recirculations through loops, and then there could easily be certain input signal conditions which could cause overloads of these calculations. Essentially, the repeated multiplications and repeated recirculations would get out of hand, generating numbers so large that there would be a temporary overflow in a digital number buffer (akin to a temporary clipping overdrive of a stage in an analog amplifier), which would temporarily clip the signal, causing a temporary burst of ugly distortion. Our hunch is that the DSD/SACD design engineers, trying to get the maximum quieting out of a system with such crude intrinsic 6 bit resolution, pressed the complexity and aggressiveness of their algorithms all the way to the limit of overdrive clipping, and had to make assumptions that the treble energy of the input music signal would never exceed certain profile limits. However, occasional bursts of real music do in fact exceed their assumed profile limits, being richer in upper treble energy than the DSD/SACD engineers predicted, and these might be overloading the digital calculations of their aggressive algorithm.
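      To make the overflow idea concrete, here is a minimal sketch of a recirculating calculation that clips when its input gets too strong. It is emphatically not the DSD/SACD algorithm: the function name `run_loop`, the gain of 3, the feedback factor of 0.5, and the 16 bit style limit of 32767 are all hypothetical values we chose for illustration.

```python
def run_loop(samples, gain=3.0, feedback=0.5, limit=32767.0):
    """Hypothetical recirculating loop: each output is gain * input plus a
    share of the previous output, held inside a fixed numeric range."""
    state = 0.0
    outputs = []
    clipped = 0
    for x in samples:
        value = gain * x + feedback * state   # multiply, then recirculate
        if abs(value) > limit:                # number too large for the buffer
            clipped += 1
            value = max(-limit, min(limit, value))  # hard clip, like an overdriven stage
        state = value
        outputs.append(value)
    return outputs, clipped


quiet_out, quiet_clips = run_loop([1000.0] * 20)   # settles near 6000, never clips
loud_out, loud_clips = run_loop([8000.0] * 20)     # would settle near 48000, so it clips
print(quiet_clips, loud_clips)
```

      The quiet input stays well inside the range, while the loud input drives the loop past its limit on the second sample and keeps it pinned there. The multiplication and the recirculation together are what push the numbers over the edge, since neither input exceeds the limit on its own.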
      Whatever the exact cause of these sonic problems, it seems intuitively clear that DSD/SACD's huge mountain of noise, rising just beyond the audible spectrum and peaking at just 20 dB below maximum full scale amplitude at 53 kc, is asking for trouble. And it is certain that DSD/SACD does indeed crash and burn when music with strong upper treble content comes along, going into paroxysms of distorted sonic cringing.

Too Much Enhancement

      The aggressive pattern recognition and averaging algorithms of DSD/SACD are a form of high power averaging, and we have praised this technique in other articles. So why are we castigating its effects here?
      High power averaging is most effectively beneficial when it is used with high resolution multibit digital systems that don't really need it in the first place, systems with CD's 16 bit resolution or higher. Applied to these systems, high power averaging can provide further beneficial enhancements in resolution and musicality. It's like the old story of bank loans. You have the best chance of getting a positive result from a loan application when you don't really need the money.
      Digital systems with high intrinsic resolution already have kept the background noise down to a very low level, so their averaging algorithms don't need to be aggressive about lowering noise and garbage a huge amount, as DSD/SACD does. They can be satisfied with merely enhancing the resolution and accuracy of the music they are already satisfactorily reproducing. Their reproduction of the actual input signal is satisfactory enough so that they don't need to invent and fabricate a fictional version of the music signal, and so that they don't need to aggressively impose upon all the music a uniform average of a repeated pattern.
      Furthermore, multibit PCM systems with decent intrinsic resolution (16 bits or higher) achieve full intrinsic resolution and quieting even at their highest frequencies of operation. This means that they don't need to subdue and sacrifice the distinct individuality of singular treble transients to the Orwellian gods of averaging uniformity (as DSD/SACD must), trying to achieve better quieting and resolution in the upper trebles. In other words, multibit PCM can afford to have an averaging algorithm leave its upper trebles substantially alone, leaving the singular distinct transients to sparkle. Multibit PCM can have its averaging algorithm gradually take progressively greater effect at progressively lower frequencies. Here, at these lower frequencies, an averaging algorithm working with PCM can enhance resolution and accuracy without harming individual transients. That's because PCM already has redundant information for these lower frequencies, repeating as a pattern over many PCM samples, so it does no harm to average this already repeating pattern. With multibit PCM, the beneficial averaging algorithm can be designed so it benignly averages only what is already a repeating pattern, and does not seek to impose the tyranny of the lowest common denominator on any singular individuals. The averaging algorithm can be designed to reduce amusical errors, distortions, background noise, and to slightly enhance resolution - while leaving the accuracy of individual musical details intact.
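      The benign case, averaging a pattern that already repeats, can be sketched in a few lines. This is only an illustration under our own assumptions, not any product's algorithm: the helper names `average_periods` and `rms_error`, the 32 sample sine period, the noise level of 0.5, and the 64 repetitions are all hypothetical.

```python
import math
import random


def average_periods(periods):
    """Average corresponding samples across equally long periods."""
    count = len(periods)
    return [sum(column) / count for column in zip(*periods)]


def rms_error(estimate, reference):
    """Root mean square difference between an estimate and the clean pattern."""
    total = sum((e - r) ** 2 for e, r in zip(estimate, reference))
    return math.sqrt(total / len(reference))


rng = random.Random(0)  # fixed seed so the run is repeatable
pattern = [math.sin(2 * math.pi * i / 32) for i in range(32)]  # one clean period
noisy_periods = [[s + rng.gauss(0, 0.5) for s in pattern] for _ in range(64)]

single_error = rms_error(noisy_periods[0], pattern)
averaged_error = rms_error(average_periods(noisy_periods), pattern)
print(single_error, averaged_error)
```

      Because the same pattern really is present in every period, averaging pulls the random error down while leaving the pattern itself where it was.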
      In contrast, the intrinsic resolution of DSD/SACD is so poor and crude that, for a typical classical music note, all detail smaller than 1/6 of the full amplitude of the note is totally lost in garbage and noise. When you think of noise, you usually think of background noise. But with DSD/SACD, this is no mere background noise. In point of fact, it is incorrect to say that all that musical information, smaller than 1/6 of the full amplitude of this classical music note, is present but merely amongst loud background noise. Rather, it is gone, period. All that musical information is actually gone for good, disappeared amidst the high tide of rubble generated by DSD/SACD's crude 6 bit intrinsic resolution. DSD/SACD can't even see or detect this musical information. That's like not being able to see any object smaller than 5 miles across on our re-scaled 32 mile trip. And all this lost information cannot be truly rescued or recovered or reproduced, by any mechanism, ever again. That's an absolutely fatal problem for a digital system that pretends to the throne of being relied on as a music archiving system.
      DSD/SACD cannot reproduce any of this lost information ever again, in the literal sense of reproduce. DSD/SACD doesn't have and cannot have a clue what that information lost under the rubble actually looked like. The only thing that DSD/SACD can do is to invent a new music waveform, to fabricate a fictional guess at what some (not all, just some) of this musical information lost under the rubble might have looked like. Crucially, it can only make guesses about one kind of lost information, repeating patterns, and it does this via its aggressive pattern recognition and averaging algorithm. It cannot hazard any guesses about lost singular musical transients that did not repeat themselves many times. It can sift through the rubble of noise and garbage looking for repeating patterns within that noise and garbage. But it cannot detect any singular, individual, non-repeating musical information in this sifting, because that non-repeating information looks just like another piece of noise. And so its fabricated fictional guess, about the nature of that lost information, naturally extracts and brings to the foreground only repeating patterns, while leaving individual timbral and textural and transient music sounds behind, buried forever in the rubble.
      The algorithm can only detect the repeating patterns among the noise, and then of course it has to use these as the basis for its fabricated fictional guess about the nature of the entirety of the buried information. It has no choice, since it has no other basis for its fabricated guess. If there are grey rocks and also many brightly, distinctly colored individual rocks hidden under the sand, but your vision can only detect the repeating grey color among the rocks, then you will describe the entire rock layer as a uniform single grey color, missing all the vibrantly colored singular rocks.
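      The flip side of averaging can be shown with a toy example: a one-off event that appears in only one period is diluted by the number of periods averaged. The helper name `average_periods`, the four sample pattern, the 16 periods, and the transient size of 8.0 are hypothetical numbers we picked, and this plain average is only a stand-in for whatever the real algorithm does.

```python
def average_periods(periods):
    """Average corresponding samples across equally long periods."""
    count = len(periods)
    return [sum(column) / count for column in zip(*periods)]


base = [0.0, 1.0, 0.0, -1.0]                # the repeating pattern
periods = [list(base) for _ in range(16)]   # sixteen identical repetitions
periods[5][2] += 8.0                        # a singular transient, in one period only

averaged = average_periods(periods)
print(averaged)  # [0.0, 1.0, 0.5, -1.0]
```

      The repeating pattern survives the averaging untouched, but the transient that was eight times larger than the pattern's peak comes out at just 0.5, one sixteenth of its original size.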
      Indeed, the DSD/SACD algorithm not only detects and brings to the foreground only repeating patterns, and not only ignores all non-repeating buried musical information as if it were noise, but it also actually subdues and further hides this valid and valuable singular musical information, because it shifts this information, which looks like noise, out of the audible spectrum and into the ultrasonic region where it shifts the true noise and garbage.
      Just how bad is this problem? Just how much of the signal output from DSD/SACD is fiction instead of fact? Just how inaccurate is DSD/SACD?
      Some of what DSD/SACD gives us is real, due to its true intrinsic resolution. But, as soon as DSD/SACD's true intrinsic resolution is exceeded, the rest of what it gives us is fabricated fiction, from guessing about what is buried beneath the rubble; it is emphatically not a reproduction of the input signal. And, as we have just seen, this guesswork is erroneous fiction, because it is based on the algorithm's limited ability to detect only repeating patterns within the noise, and its inability to detect singular musical transient information within that noise. That portion of the music output signal which comes from this DSD/SACD guesswork will be very inaccurate and wrong, because it will include only the smoothed down repeating patterns of music, and will subdue or exclude the distinctive sparkling unique transients (the initial transient attack, the many unique timbral and textural noises, etc.).
      So what are the proportions between the real vs. the fictional, erroneous guesswork, from DSD/SACD? Since our typical classical musical note is at 1/10 of the system's maximum full scale amplitude, the intrinsic noise and garbage level is at 1/6 of the loudest amplitude of this musical note. This puts the loudest part of the music note only about 2.5 bits above the noise and garbage (which also means that the intrinsic DSD/SACD resolution for this musical note is only about 2.5 bits).
      Now, the aggressive pattern recognition and averaging algorithm for DSD/SACD quiets midband noise by about another 14 bits (which, when added to DSD/SACD's intrinsic 6 bit resolution, yields a total midband quieting of about 20 bits). This means, when you hear a pretty, smooth classical music note from DSD/SACD, against a wonderfully quiet background, that 14 bits worth of that musical note is fictional guesswork based on repeated patterns extracted from the rubble, and only 2.5 bits worth is genuine musical reproduction which came from the input signal.
      Simply speaking (without getting into fancy math), this tells us that about 6/7 of the information in each classical music note from DSD/SACD is fabricated utter fiction, an attempt to guess at the nature of the musical information that was lost under the rubble of DSD/SACD's poor intrinsic resolution. And this 6/7 is erroneous fiction, because the guessing is based on detecting only repeating patterns and then applying that average uniformly to guessing and fabricating the entire 6/7 of musical information that was in fact lost.
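      For readers who want to check the arithmetic, here is a short sketch of it. It takes the figures above at face value (noise at 1/6 of the note's peak, 6 bits intrinsic, 14 bits of algorithmic quieting) and makes the same simplifying assumption we do, that bits can be treated as directly additive shares of the information. The variable names are our own.

```python
import math

note_to_noise_ratio = 6   # noise sits at 1/6 of the note's peak amplitude
intrinsic_bits = 6        # intrinsic resolution figure used in the text
enhancement_bits = 14     # midband quieting attributed to the algorithm
genuine_bits = 2.5        # rounded figure used in the text

# Each bit is a factor of two, so a 6:1 ratio is log2(6) bits above the noise.
bits_above_noise = math.log2(note_to_noise_ratio)

total_quieting_bits = intrinsic_bits + enhancement_bits

# Share of the note attributed to the algorithm rather than the input signal.
fabricated_fraction = enhancement_bits / (enhancement_bits + genuine_bits)

print(round(bits_above_noise, 2), total_quieting_bits, round(fabricated_fraction, 3))
```

      The 6:1 ratio works out to roughly 2.6 bits, in line with the "about 2.5 bits" quoted above, and 14 out of 16.5 bits is about 0.85, which rounds to the 6/7 figure.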
      With 6/7 of the musical information in each musical note being a smoothed down averaged fabricated fiction, based on only the repeating patterns within the note and ignoring all the unique, distinctive transients within that note, it's small wonder that the signal coming out of DSD/SACD sounds so different from the signal input, and sounds so smoothed down. A typical classical music note from DSD/SACD is 6/7 fabricated fiction, not reproduction of the input signal. It is 6/7 erroneous fiction, because the guesses for the nature of that entire missing 6/7 are based on detecting only repeating patterns, then assuming that all the musical information in this missing 6/7 uniformly fits these detected repeating patterns, and then fabricating the fiction of this entire lost 6/7 from this single uniform mold.
      Let's translate this back to our river beach analogy. Suppose it was your job to accurately describe the material composition of this beach, so you could reproduce it elsewhere. Suppose 1/7 of the material in this beach were the tan sand on top, which you could easily see and accurately describe. Suppose that 6/7 of this beach were those round river rocks buried under the sand. You sift through the sand, looking for the information that lies buried. But, as you repeatedly sift through the sand, you are not able to detect the many individual brightly colored rocks (perhaps you are color blind, or your nearsighted vision has such poor resolution that you can't really see the individual rocks, and instead you can only make out the overall average grey color of the layer of rocks). All you are able to discern is the repeating grey color pattern of the rock field.
      You are now required to reproduce the material composition of this beach. To keep things simple, let's stick to colors. First, you know that you can correctly reproduce the color of 1/7 of the beach, the tan blob (of sand) on top that you can directly see. Now, what about the color of the remaining 6/7 of the beach? This lies buried underneath the sand, so you can't literally reproduce it, and you are forced to fabricate a guess. When you sifted through some sand, your limited vision only allowed you to see the repeating grey color pattern of the rock field. So that's the only basis you have, upon which you could fabricate a guess as to the description of the color of the entire 6/7 rock layer that is buried. And thus you fabricate a fictional description of the entire rock layer, 6/7 of the beach, as being uniformly grey. But of course your fictional re-creation of the beach is wrong. In fact, 6/7 of the beach includes many individual rocks with distinctive sparkling colors, which you couldn't see when you sifted through the sand, given your limited ability to only look for a repeated color pattern.
      Only 1/7 of the new beach created from your description is a true reproduction of the original beach. The remaining 6/7 of the new beach is not a reproduction at all, but merely an invention, a fabricated fictional guess. And it is an erroneous guess, being much more uniformly and smoothly grey than the original. The sparkling colors which gave vivacious distinctive character to each individual instant in the original are lost forever.
      It is precisely because DSD/SACD's intrinsic resolution is so crude, at merely 6 bits, that such a high proportion of the music (6/7 for a typical classical music note) must be fabricated as a fiction out of guesswork. Additionally, it is precisely because DSD/SACD's intrinsic resolution is so crude that its enhancing algorithm must do so much work, to make up the huge amount of ground between 2.5 bit (or 6 bit) resolution and the lofty goal of 20 bit resolution. This is precisely why the algorithm must be so aggressive in pursuing repeating patterns, and making the entire 6/7 of fabricated music look like and sound entirely like a smoothed down uniform average of a repeating pattern. If the algorithm did not have to do so much enhancing work, then it would not have to be so aggressive in seeking out repeating patterns and imposing the tyranny of the average repeating pattern upon all the music.
      If only DSD/SACD had been endowed with a higher sampling rate and thus higher intrinsic resolution at the outset, then a higher proportion of the music could have been truly reproduced from the input, instead of having to be fabricated as fiction from guesswork. Then the enhancing algorithm would not have had as much ground to make up. Then the enhancing algorithm could have been made less aggressive in pursuing repeating patterns, and less tyrannical in imposing the smoothed down uniform average of the repeating pattern on that (smaller) proportion of music it had to fabricate out of guesswork. If only.

     