the original digitizing system. Thus, high power averaging truly increases resolution, even beyond the information contained in each 16 bit originally digitized sample.

Sonic Benefits of Higher Sampling Rates
Is this magic? Is this high power averaging somehow creating more information? Is the law of conservation of information being violated? No. Resolution of each sample point is indeed being truly increased, beyond the 16 bits of information originally digitized. But the increase in information for this one sample point does come from somewhere in the digitized input data; it is not newly created by magic. The information for improving the true resolution of one sample point comes from many other nearby sample points. It comes from the average trend of these many other sample points. The accurate, original music waveform is already contained within, but hidden among, the statistical scatter of digitized samples which, thanks to the crude approximation of 16 bit quantization, vary from the correct waveform value in a random, noiselike way. High power averaging finds and calculates this original music waveform hidden amongst the statistical noisy scatter, by using the information contained in many sample points, to reduce the noisy quantization errors, and thereby improve the accuracy and resolution of each of the sample points. Thus, musical resolution can truly be improved (at least for middle and lower frequencies), even beyond the coarse limitations of the original 16 bit digitizing.
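The idea can be illustrated with a minimal Python sketch. It is not the algorithm of any product named in this article. It assumes a plain centered moving average as a stand-in for high power averaging, a slow sine wave as the music, a coarse quantizer with a step of 1.0, and a small random dither (an assumption of this sketch) so that the quantization errors behave in the random, noiselike way described above. All function names are illustrative.

```python
import math
import random

def quantize(x, step):
    """Round x to the nearest multiple of step (a crude uniform digitizer)."""
    return step * round(x / step)

def moving_average(samples, width):
    """Centered moving average; near the ends it uses the neighbors that exist."""
    half = width // 2
    out = []
    for i in range(len(samples)):
        window = samples[max(0, i - half):i + half + 1]
        out.append(sum(window) / len(window))
    return out

def rms_error(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def demo(n=4000, step=1.0, width=33, seed=0):
    rng = random.Random(seed)
    # The "original music waveform": one slow sine cycle, a few steps tall.
    true_wave = [3.0 * math.sin(2 * math.pi * i / n) for i in range(n)]
    # Assumed dither of +/- half a step makes the rounding errors noiselike.
    digitized = [quantize(t + rng.uniform(-step / 2, step / 2), step)
                 for t in true_wave]
    averaged = moving_average(digitized, width)
    return rms_error(digitized, true_wave), rms_error(averaged, true_wave)

if __name__ == "__main__":
    raw, averaged = demo()
    print("RMS error of raw samples:     ", raw)
    print("RMS error of averaged samples:", averaged)
```

Each averaged point draws on 33 neighbors, so its error against the true waveform comes out well below that of any single digitized sample, even though no sample was ever stored more finely than one step.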
That's why some CD players like the Capitole, some D-A processors, and some outboard boxes like the dcs Purcell can make your 16 bit CDs sound better (as though they had the same 20-24 bit resolution as the new super digital formats) -- revealing more musical detail with higher resolution, and also sounding more musically natural, since the averaged waveform they create is truer to the original music waveform than the crudely approximate 16 bit input digitization was.
All this might seem like an impossible miracle, like the proverbial turning of straw into gold or a sow's ear into a silk purse. It seems counterintuitive to get better quality from the output than we put in at the input, to get out 20+ bits of musical accuracy and resolution when we have put in only 16 bits worth into the CD medium. But that indeed is the miracle of high power averaging. And that's why products like the Capitole with the Anagram module or the dcs Purcell are so invaluable. They can transform your huge library of 16 bit CDs into a close equivalent of 20-24 bit super format recordings, for most of the musical spectrum. That's especially useful considering the lack of available software on the new super formats. And it's priceless considering that most of the music in your CD library will never be re-archived onto a super format digital recording.
Incidentally, this miracle is achieved without violating any of nature's laws of physics (no animals were harmed). The laws of entropy are obeyed because the high power averaging achieves improved order and decreased random noise/error within the passband by trading off and allowing worse order and increased noise at very high ultrasonic frequencies outside the passband. That's why these high power averaging schemes are sometimes called noise shifting; they shift noisy random errors from within the musical spectrum to beyond the musical spectrum. To accomplish this, the computers implementing this high power averaging must run at a fast oversampled clock rate, so they can access the very high ultrasonic frequencies to use them as a dumping ground for the noise they remove from within the musical spectrum.
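The trade described above can be sketched in Python with a first-order error-feedback quantizer, the simplest form of what is more commonly called noise shaping. This is a sketch under stated assumptions, not any manufacturer's design: the quantizer rounds to whole-number levels, the input is a slow sine already running at the fast clock rate, and a 33-sample moving average stands in for "within the passband". The function names are hypothetical.

```python
import math

def moving_average(values, width):
    """Average each run of `width` consecutive values (full windows only)."""
    return [sum(values[i:i + width]) / width
            for i in range(len(values) - width + 1)]

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

def plain_quantize(signal):
    """Round every sample to the nearest whole level, independently."""
    return [float(round(x)) for x in signal]

def noise_shifting_quantize(signal):
    """First-order error feedback: each rounding error is subtracted
    from the next sample before that sample is rounded."""
    out = []
    prev_err = 0.0
    for x in signal:
        v = x - prev_err
        y = float(round(v))
        prev_err = y - v          # error made on this sample
        out.append(y)
    return out

def compare(n=4000, amplitude=3.3, width=33):
    signal = [amplitude * math.sin(2 * math.pi * i / n) for i in range(n)]
    results = {}
    for name, quantizer in (("plain", plain_quantize),
                            ("shifted", noise_shifting_quantize)):
        err = [y - x for y, x in zip(quantizer(signal), signal)]
        results[name] = {
            "total": rms(err),                           # error at all frequencies
            "in_band": rms(moving_average(err, width)),  # low-frequency part only
        }
    return results

if __name__ == "__main__":
    for name, r in compare().items():
        print(name, r)
```

In this sketch the shifted version shows a larger total error than plain rounding but a much smaller in-band error: the error has been moved to high frequencies, where the averaging removes it.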
Some people have wondered aloud and speculated why upsampling or oversampling improves the sound of CDs in so many ways and so dramatically. It's important to note that averagers or noise shifters improve the sound you hear within the passband not by their oversampling or upsampling actions. Oversampling or upsampling in itself does not provide any extension in frequency response to higher frequencies, nor any higher resolution or accuracy to the original music waveform, nor any more natural musicality. Rather, oversampling or upsampling is merely a tool, and a required tool, which is preliminary to the work of doing the high power averaging calculations. And it is these averaging calculations, not the upsampling, which provide all the manifold sonic improvements we all hear.
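A small Python sketch makes the point concrete. It assumes the simplest form of oversampling, in which the faster clock just reads each input sample several times over; the sample values are made up for illustration.

```python
def oversample_by_repetition(samples, factor):
    """Raise the clock rate by emitting each input sample `factor` times."""
    return [s for s in samples for _ in range(factor)]

def decimate(samples, factor):
    """Keep every `factor`-th sample, which undoes the repetition exactly."""
    return samples[::factor]

cd_samples = [0, 3, 5, 4, 1, -2]    # made-up sample values
fast = oversample_by_repetition(cd_samples, 4)

if __name__ == "__main__":
    print(fast)
    print(decimate(fast, 4) == cd_samples)
```

The faster stream holds four times as many numbers but exactly the same values, and the original can be recovered from it unchanged. Any improvement has to come from what is computed afterward at the faster rate.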
We've just finished looking at one source of digital errors, the crude native resolution of most digital media and the resulting quantization errors. Now, throughout the digital chain, there are many other insidious sources of contaminating noise, error, and/or distortion. Even at the beginning of the digital chain, the analog to digital convertor, which digitizes the original microphone feed or analog music signal, contains complex circuitry that is prone to many kinds of errors (e.g. sample and hold nonlinearities, noise, glitches, jitter, etc.). All these various errors are added onto the basic quantization error of the 16 bit system (discussed above). These various added errors mean that the digital system does not even achieve the promised native resolution (of 16 bits in this case). The digital values for each sample point will deviate even further from the true music waveform, thanks to these added errors. Later links in the digital chain add further errors, for example by adding more jitter.
High power averaging can come to the rescue again, compensating for and reducing the effect of most of these added errors. It does not matter what the source of the error was, or what type of cause the error had, or where in the digital chain the error occurred. To the extent that the added errors were random, they can be viewed as simply being added noise, part of the statistical scatter that deviates from the correct music waveform curve that is hidden among the statistical scatter as its trend. And, as such, they can be compensated for and reduced by high power averaging. Thus, high power averaging can compensate for a wide assortment of sins that are endemic to most or all digital systems, throughout the digital system chain. And that's why high power averaging can help digital recordings to sound less digital, more like the original live music, and more like great analog, which of course does not commit these digital sins.
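The qualifier "to the extent that the added errors were random" matters, and a short Python sketch shows why. It relies on the standard statistical rule that averaging N independent random errors shrinks their RMS size by about the square root of N, while a constant (systematic) error passes through an average untouched. The Gaussian noise model, the `bias` parameter, and the function name are assumptions of this sketch.

```python
import math
import random

def averaged_error_rms(n_avg, noise_rms=1.0, bias=0.0, trials=2000, seed=1):
    """RMS error left after averaging n_avg samples, each carrying
    independent random noise plus an optional constant bias."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        errors = [rng.gauss(0.0, noise_rms) + bias for _ in range(n_avg)]
        mean_error = sum(errors) / n_avg
        total += mean_error ** 2
    return math.sqrt(total / trials)

if __name__ == "__main__":
    for n in (1, 4, 16, 64):
        print(n, "random only:", averaged_error_rms(n),
              "with bias 0.5:", averaged_error_rms(n, bias=0.5))
```

Averaging 16 samples leaves roughly a quarter of the random error, but with a bias of 0.5 the residual never falls below 0.5 no matter how many samples are averaged.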
Another important source of digital errors was discussed above. The digital playback filter has the crucial responsibility of literally re-creating the music waveform from sketchy clues coming off the CD. To do this re-creation correctly, it must perform calculations based on a perfect boxcar filter function. But this perfect ideal boxcar filter function cannot be physically realized in practice. It can only be approximately approached. And therefore the digital filter will inevitably calculate an incorrect music waveform, with each data sample point deviating from what it should be by some error, especially above 2 kc. The calculations from the digital filter won't even fulfill the limited promise of 16 bit resolution, because of these added errors. But these computational filter errors will hopefully be pretty random in nature, being on both the plus and minus side of what the ideal perfect boxcar filter would have produced. And, to the extent that these errors are random, they too can be compensated for and reduced by high power averaging (applied to the output of the digital filter, after the filter has finished its calculations). In effect, the digital filter's approximations, resulting from its failure to be a perfect boxcar filter, yield a statistical scatter of points that hover in the neighborhood of the ideal music waveform that a perfect boxcar digital filter would re-create. High power averaging can discern the trend of this statistical scatter and compute the best fit curve, which will be much closer to the original music waveform. Thus, high power averaging can improve (at least for frequencies below the passband edge) the necessarily imperfect performance of digital filters in re-creating the music waveform.
By reducing all these additional digital system errors just discussed above, high power averaging can bring us yet closer to the original music waveform before it was digitized. High power averaging, by reducing these various erroneous deviations from accuracy, can reveal more of the original waveform's detail with better resolution, so it allows digital, even ordinary CDs, to sound more transparent, like a higher bit digital system. And, because high power averaging reduces noise, digital errors, and distortion, it allows digital to sound cleaner, with fewer and lesser annoying and fatiguing digital artifacts. And, because high power averaging brings us closer to the original analog music waveform picked up by the recording microphone, it allows digital to sound more naturally musical, more like the original real live event (and more like great analog). Of course, all the above finally explains why listeners report hearing precisely these sonic improvements from so called upsamplers like the dcs Purcell, from digital systems that depend on high power averaging like Sony's DSD-SACD, and from advanced CD players like the Capitole with the Anagram module. Again, oversampling or upsampling is a tool required to perform this high power averaging, but it is the high power averaging that provides all the sonic benefits you hear, not the upsampling per se.
Although a high power averaging algorithm can be almost magically beneficial, we don't want to leave you with the impression that designing one is a piece of cake. They are very complex to design, with many design choices and tradeoffs. This of course means that one engineer's averaging algorithm will surely perform and sound very differently from every other engineer's algorithm. So here once again is room for dramatic sonic differences among competing digital audio products, now by virtue of their different averaging algorithm designs. High power averaging systems should be designed with expertise, care, discriminating musical judgment, and a good dose of inspirational luck. It's much like designing a very complex amplifier and filter system, with multiple feedback loops. The designer must choose the order of averaging (how many layers), the type of feedback loops used (involving complexities similar to those facing the designer of an amplifier feedback loop), the frequency profile of the averaging (how much he attempts at each frequency), gain and phase and overload parameters, and the nature of the eventual rolloffs at very high frequencies (which can affect transient response, stability, overload, etc). There are new kinds of technical internal design challenges which need to be carefully engineered, with judicious balancing of tradeoffs. Poor engineering or injudicious tradeoff choices can make the filter itself introduce new problems, with new adverse sonic consequences (e.g. problems such as overload clipping or instability from high order feedback).
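To give a feel for what "order" means here, the following Python sketch compares error-feedback quantizers of order 0 (plain rounding) through 3. It assumes that the order of averaging corresponds to the order of the noise-shaping loop, as the term is commonly used for sigma-delta designs, and it uses a textbook loop whose error is shaped by (1 - z^-1) raised to that order. The quantizer here has unlimited whole-number levels, which sidesteps the overload and stability problems a real design with few levels must solve. Nothing below reflects any manufacturer's actual algorithm, and all names are illustrative.

```python
import math
from math import comb

def moving_average(values, width):
    """Average each run of `width` consecutive values (full windows only)."""
    return [sum(values[i:i + width]) / width
            for i in range(len(values) - width + 1)]

def smooth(values, width, passes=3):
    """Crude low-pass filter: a moving average applied `passes` times."""
    for _ in range(passes):
        values = moving_average(values, width)
    return values

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

def shaped_quantize(signal, order):
    """Round to whole levels, feeding back the last `order` rounding errors.
    order=0 is plain rounding with no feedback."""
    # Feedback weights from the binomial expansion of (1 - z^-1)**order.
    weights = [(-1) ** j * comb(order, j) for j in range(1, order + 1)]
    history = [0.0] * order            # most recent error first
    out = []
    for x in signal:
        v = x + sum(w * e for w, e in zip(weights, history))
        y = float(round(v))
        if order:
            history = [y - v] + history[:-1]
        out.append(y)
    return out

def in_band_error(order, n=4000, amplitude=3.3, width=33):
    """RMS of the low-frequency part of the quantization error."""
    signal = [amplitude * math.sin(2 * math.pi * i / n) for i in range(n)]
    err = [y - x for y, x in zip(shaped_quantize(signal, order), signal)]
    return rms(smooth(err, width))

if __name__ == "__main__":
    for order in range(4):
        print("order", order, "in-band error:", in_band_error(order))
```

Under these assumptions, the feedback loops leave far less in-band error than plain rounding, and the 2nd and 3rd order loops are bounded far lower still. The price is a growing set of feedback weights and larger swings inside the loop, which is where the overload and stability concerns mentioned above come from once the number of quantizer levels is limited.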
Generally, the higher order and more powerful averaging systems are more difficult to engineer correctly, and it's trickier to avoid the many pitfalls, than with simpler, lower order averaging systems. But talent can compensate for hurdles. It's useful to compare some examples.
In the old days, most engineers figured you couldn't go any higher than 2nd order averaging, because 3rd order averaging would introduce stability problems, thereby corrupting the very music signal it was trying to help. Then Vimak successfully engineered a 3rd order averaging algorithm that helped their CD player to sound very special in its day.
But even 3rd order averaging wouldn't be enough for Sony, who faces the daunting task of converting their 1 bit DSD system (with 8 bits effective resolution) all the way up to 20 bits of musical resolution (at least for music's middle and lower frequencies). To perform this giant conversion leap, Sony needs more powerful help from an averaging algorithm. So they have been forced to design a very aggressive high power averaging algorithm, reportedly with 5th order (multi-layer) averaging. However, their design choices leave some sonic problems at music's high frequencies (high noise, fuzzy defocus, and veiling) because their averaging algorithm runs out of steam there. And their design choices also create new sonic problems (a distorted, noisy smearing of some high frequency sounds having a lot of energy), which we think are caused by poor design tradeoff choices in the curve of the averaging algorithm (Martin Colloms measured a huge noise peak around 53 kc, which could intermodulate with high frequency, high energy music to create audible dirty bursts of modulation noise that vary with the music and thus sonically smear and dirty the music). See IAR's other articles on DSD-SACD for further discussion.
Does this mean that 5th order is an impossibly high goal for any averaging system, and so we must all set our sights lower? No. In fact, the folks at dcs set their sights even higher. They engineered a very complex 9th order averaging algorithm for their Purcell, and we think they must have done a very good job with this design, as evidenced by the musical naturalness of their sonic results (with clean, delicate high frequencies). Obviously dcs did a better engineering job than Sony, and on a more complex averaging algorithm at that. In fairness to Sony, it's possible that trying to work with a 1 bit system imposes constraints that dcs does not face working with their multibit system. But, if this is the case, Sony should not be touting 1 bit systems as superior to multibit systems.
Anagram is being closemouthed about the design of their module, employed in the Capitole, but it's sonically obvious that their design choices do a superb job of extracting more musical inner detail out of 16/44.1 CDs than one would have thought possible, especially at middle and lower frequencies. At music's high frequencies, the Anagram algorithm sounds very different from the dcs, being harder and more articulate rather than delicate and feathery as the dcs sounds. You can pick which sound you prefer. In any case, this sonic difference further demonstrates that digital filters and averaging algorithms are responsible for literally generating the music waveform you hear, and different designs will re-create different sounding music, from the same sketchy clues coming off the CD that we call sample data.
For what it's worth, the information that Anagram does choose to reveal about the workings of its wondrous module is available on its website at anagramtech.com. The Anagram information suggests that their signal manipulation uses very sophisticated math and complex calculations in some unique ways, including re-sampling into a continuous time domain rather than the usual discrete time domain. The Anagram software driving their module, as developed for Audio Aero, is called STARS® (Solution for Time Abstraction Re Sampling) processing, a combination of very high speed re-sampling, interpolation and signal enhancement techniques. This STARS software is unique and exclusive to Audio Aero, and therefore different from the software used for example by Camelot or any other manufacturer using the Anagram ATF192 module (main differences are in the jitter software, the averaging filter parameters, and the oversampling, which is 1024 times for STARS vs. 64 times for ATF192).
So far, we've concentrated on 16/44.1 digital recordings, the majority of your CD library, to show you how remarkable sonic improvements are possible when the wizardry of high power averaging is added to your playback chain. And we've seen that oversampling or upsampling per se is not the responsible agent for these sonic improvements. Now, let's look at another kind of sampling rate increase that does seem to provide sonic improvements. In the above discussion the clock rate was increased after the original analog music signal had already been digitized (at 44.1 or 48 kc), and increasing the clock rate at this late stage simply involved looking at the same input sample repetitively, say four times for four times oversampling. But what if we employ a faster clock from the very beginning, actually digitize the original analog music signal at a higher sampling rate, and then keep this higher sampling rate through the digital chain (including the storage medium that transfers the recording to the consumer)? For example, we could digitize and record music at a 96 kc sampling rate instead of the standard 48 kc sampling rate, doubling our sampling rate and thereby also doubling the extension of our musical passband, from 20 kc to 40 kc.
Listeners report hearing dramatic sonic improvements from digital recordings made at a higher sampling rate, such as 96 kc, compared to those made at the standard 44.1 or 48 kc. Curiously, listeners report similar types of sonic improvements as discussed above for high power averaging: better detail and transparency, cleaner sound, and more musically natural sound. And, curiously as well, they also report hearing all these improvements throughout the spectrum, not just in the extreme treble where one would expect it (from extending the passband, to 40 kc instead of 20 kc).
Why does doubling the recording sampling rate provide all these unexpected sonic improvements? People have wondered aloud why doubling the professional recording sample rate should have the sonic benefits it does. The natural temptation is to focus all one's analytical attention on the primary effect that doubling the sample rate achieves: it extends the digital system's high frequency response, from 20 kc to 40 kc. And then this thinking sends people off on wild goose chases of speculation, as they wonder out loud about our apparent ability to hear improvements in the ultrasonic 20 kc to 40 kc range.
Some wondering speculation has centered on trying to figure out why extending the musical bandwidth in the ultrasonic region from 20 kc up to 40 kc should even be audible at all. People have speculated about our sensitivity to effects of ultrasonic frequency reproduction, even running tests with special supertweeters. But whatever the answer to our ultrasonic perception capabilities, this speculation doesn't even address the crucial question of why extending the passband from 20 kc to 40 kc could have such audible benefits in improving other unrelated sonic factors (transparency, detail, and natural musicality), most remarkably at middle musical frequencies that are far removed from the ultrasonic spectral extension being provided.
Other speculators have guessed that improvements in this 20 kc to 40 kc range, even if not directly audible, might nevertheless cause other effects that are audible, and they have speculated what the linkage might be.
For example, some have speculated that different ultrasonic response might provoke different IM distortion behavior in amplifiers, changing difference IM distortion byproducts that are down below 20 kc and thus within the audible range. This speculation is rubbish because the sonic improvements from doubling the digitizing sampling rate are too dramatic and too broadband to be caused by today's best amplifiers (which are too perfect to cause such drastic sonic differences from IM distortion), and because these same sonic improvements are consistently heard regardless of which amplifier is employed.
Another example is speculation that the ringing pattern of the playback boxcar filter does not last as long in time when the frequency of the boxcar corner is raised (which it can be if the original sampling rate is raised) - and thus there is less obscuring time smearing of the music by the ringing