Title of Invention	ADAPTIVE RESIDUAL AUDIO CODING
Abstract	This invention relates to an audio encoder (10) for encoding an audio signal having at least two channels (18), comprising a parameter extractor (16) for deriving a coherence parameter (ICC) describing a coherence between a first channel and a second channel of the at least two channels and a level parameter (IID) describing a level difference between the first channel and the second channel as spatial parameters; a limiter (14) for limiting the coherence parameter to derive a limited coherence parameter, wherein the limit of the coherence parameter depends on the level parameter and on a scaling factor; and a down-mixer (12) for deriving a downmix signal (20) and a residual signal (22) from the audio signal using a down-mixing rule depending on the limited coherence parameter.

Full Text	ADAPTIVE RESIDUAL AUDIO CODING

Field of the invention

The present invention relates to the encoding and decoding of audio signals and in particular to the efficient high-quality coding of a pair of audio channels.

Background of the invention and prior art

Recently, effective high-quality coding of audio signals has become more and more important, as digital distribution of compressed audio and video content, e.g. by satellite or by terrestrial digital audio or video broadcasting, is widely used. The well-known MP3 technique, for example, allows for convenient transmission of audio titles over the internet or other transmission channels having limited bandwidth. In addition to MP3, several other audio coding schemes aim to maximize the audio quality for a given compression ratio or bit rate.

It has been shown in "Efficient and scalable Parametric Stereo Coding for Low Bit rate Audio Coding Applications", PCT/SE02/01372, that it is possible to recreate a stereo signal that closely resembles the underlying original stereo image from a mono signal when a very compact representation of the stereo signal, commonly referred to as "spatial cues", is additionally used. The disclosed principle is to divide the stereo input signal into frequency bands and to estimate parameters called inter-channel intensity difference (IID) and inter-channel coherence (ICC) for each of the frequency bands separately. The first parameter describes a measurement of the power distribution between the two channels in the specific frequency band and the second parameter describes an estimation of the correlation between the two channels. A more thorough description of spatial parameters may be found in "High-quality parametric spatial audio coding at low bit rates", J. Breebaart, S. van de Par, A. Kohlrausch and E. Schuijers, Proc. 116th AES Convention, Berlin (Germany), May 8-11, 2004. Based on these spatial cues, the stereo input signal is adaptively combined into a mono signal.
Both the spatial cues and the mono signal are coded and the coded representation is multiplexed into a bit stream that is transmitted to the decoder. On the decoder side, the stereo image is recreated from the mono signal by distributing the energy of the mono signal between the two output channels in accordance with the IID data, and by adding a decorrelated signal in order to retain the channel correlation of the original stereo channels, as described by the ICC parameters.

When more transmission bandwidth is available, a higher audio quality can be achieved by replacing the decorrelated mono signal in the decoder by a transmitted residual signal. That is, the transmission of an additional residual signal to a decoder is required. This is also the case with mid-side (MS) coding, where the sum and the difference of the channels of a stereo signal are coded rather than the left and right channels directly. A description of the MS technique may be found in "Sum-difference stereo transform coding", Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), San Francisco, USA, 1992, pp. II 569 - 572.

MS coding is based on the finding that the left and the right channel of a stereo signal are, with high probability, rather similar. Therefore, the difference of the left and the right channel will yield a signal having a comparatively low intensity most of the time, i.e. the amplitude of the difference signal will be rather small. Hence, one can save a significant amount of bit rate when encoding the difference signal, since the parameters describing the difference signal can be coarsely quantized. The sum signal will evidently need about the same bandwidth as a single left or right channel when encoded. Therefore, one can save a significant amount of bandwidth in total when using the MS coding scheme.
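The sum/difference idea behind MS coding can be illustrated with a minimal sketch. The function names, the test signals and the factor 1/2 in the inverse transform are illustrative assumptions and are not taken from the cited ICASSP paper or from any particular codec. The sketch compares the energy of the difference signal for two nearly identical channels with that for a signal present in one channel only.

```python
import math

def ms_encode(left, right):
    """Return (mid, side): the sample-wise sum and difference of two channels."""
    mid = [l + r for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side):
    """Invert ms_encode: recover left and right from sum and difference."""
    left = [(m + s) / 2 for m, s in zip(mid, side)]
    right = [(m - s) / 2 for m, s in zip(mid, side)]
    return left, right

def energy(x):
    """Sum of squared samples of a real-valued signal."""
    return sum(v * v for v in x)

n = 64
left = [math.sin(2 * math.pi * 3 * k / n) for k in range(n)]

# Case 1: nearly identical channels, so the difference signal is small.
right = [0.9 * v for v in left]
mid, side = ms_encode(left, right)

# Case 2: signal in the left channel only (large intensity difference),
# so the difference signal carries as much energy as the left channel.
_, side_hard = ms_encode(left, [0.0] * n)

print(energy(side) / energy(left))
print(energy(side_hard) / energy(left))
```

In the first case the difference signal carries roughly one percent of the energy of a single channel; in the second case it carries all of it, which is the situation in which a static sum/difference rule stops paying off.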
When a large intensity difference between the left and the right channel exists, the MS technique has its limits, since then the difference channel will also contain a substantial amount of energy and therefore needs a higher bandwidth. It may be noted, however, that in regular stereo-coded implementations, MS coding will not be applied in this case, due to high encoding costs. In those cases, it is advantageous to have the possibility to switch between normal stereo coding and MS coding, depending on the intensity carried by the original audio channels that have to be encoded.

One can overcome the above problem by replacing the static concept of building the sum and the difference of the two stereo channels with a decoder rotator matrix whose matrix elements describe the composition of two intermediate channels that are a combination of the two stereo channels. The matrix elements depend on parametric stereo parameters that are extracted from the left and the right channel of the stereo signal. Adaptive residual coding is thus able to dynamically adapt the combination rule for the generation of intermediate channels to the properties of the present signal, achieving a significant performance gain over MS coding.

By choosing a suitable dependency of the matrix elements of the so-called rotator matrix on the parametric stereo parameters, one can keep the energy within the difference channel as small as possible, as already shown in the unpublished European patent application EP 04103168.3. As one introduces a rotator matrix to transform (down-mix or up-mix) the stereo signal to signals m and s (the intermediate signals, i.e. the downmix signal m and the residual signal s), it is crucial for the operation of the method that the rotator matrices (the decoder rotator matrix and the encoder rotator matrix) are bounded.
This means that the matrix elements do not diverge to infinity anywhere within the entire range of possible parametric stereo coding parameters. In other words, both rotator matrices have to be bounded in the sense that the matrix condition number is sufficiently small to allow problem-free matrix inversion for the entire range of parametric stereo coding parameters, which is not the case for implementations according to prior art techniques.

Summary of the invention

It is the object of the present invention to provide a concept for high-quality audio coding that yields a highly compressed representation of an audio signal while more efficiently avoiding artefacts introduced by the coding or decoding.

According to a first aspect of the present invention, this object is achieved by an audio encoder for encoding an audio signal having at least two channels, comprising: a parameter extractor for deriving a spatial parameter from the audio signal, wherein the spatial parameter describes an interrelation between the at least two channels; a limiter for limiting the spatial parameter using a limiting rule to derive a limited spatial parameter, wherein the limiting rule depends on an interrelation between the at least two channels; and a down-mixer for deriving a downmix signal and a residual signal from the audio signal using a down-mixing rule depending on the limited spatial parameter.
According to a second aspect of the present invention, this object is achieved by an audio decoder for decoding an encoded audio signal representing an original audio signal having at least two channels, the encoded audio signal having a downmix signal, a residual signal and a spatial parameter describing an interrelation between the at least two channels, comprising: a limiter for limiting the spatial parameter to derive a limited spatial parameter using a limiting rule, wherein the limiting rule depends on an interrelation between the at least two channels; and an up-mixer for deriving a reconstruction of the original audio signal from the downmix signal and the residual signal using an up-mixing rule depending on the limited spatial parameter.

According to a third aspect of the present invention, this object is achieved by a method for encoding an audio signal having at least two channels, the method comprising: deriving a spatial parameter from the audio signal, wherein the spatial parameter describes an interrelation between the at least two channels; limiting the spatial parameter using a limiting rule to derive a limited spatial parameter, wherein the limiting rule depends on an interrelation between the at least two channels; and deriving a downmix signal and a residual signal from the audio signal using a down-mixing rule depending on the limited spatial parameter.
According to a fourth aspect of the present invention, this object is achieved by a method for decoding an encoded audio signal representing an original audio signal having at least two channels, the encoded audio signal having a downmix signal, a residual signal and a spatial parameter describing an interrelation between the at least two channels, the method comprising: limiting the spatial parameter to derive a limited spatial parameter using a limiting rule, wherein the limiting rule depends on an interrelation between the at least two channels; and deriving a reconstruction of the original audio signal from the downmix signal and the residual signal using an up-mixing rule depending on the limited spatial parameter.

According to a fifth aspect of the present invention, this object is achieved by a transmitter or audio recorder having an audio encoder for encoding an audio signal having at least two channels, comprising: a parameter extractor for deriving a spatial parameter from the audio signal, wherein the spatial parameter describes an interrelation between the at least two channels; a limiter for limiting the spatial parameter using a limiting rule to derive a limited spatial parameter, wherein the limiting rule depends on an interrelation between the at least two channels; and a down-mixer for deriving a downmix signal and a residual signal from the audio signal using a down-mixing rule depending on the limited spatial parameter.
According to a sixth aspect of the present invention, this object is achieved by a receiver or audio player having an audio decoder for decoding an encoded audio signal representing an original audio signal having at least two channels, the encoded audio signal having a downmix signal, a residual signal and a spatial parameter describing an interrelation between the at least two channels, comprising: a limiter for limiting the spatial parameter to derive a limited spatial parameter using a limiting rule, wherein the limiting rule depends on an interrelation between the at least two channels; and an up-mixer for deriving a reconstruction of the original audio signal from the downmix signal and the residual signal using an up-mixing rule depending on the limited spatial parameter.

According to a seventh aspect of the present invention, this object is achieved by a method of transmitting or audio recording, the method having a method of generating an encoded signal, the method comprising a method for encoding an audio signal having at least two channels, the method comprising: deriving a spatial parameter from the audio signal, wherein the spatial parameter describes an interrelation between the at least two channels; limiting the spatial parameter using a limiting rule to derive a limited spatial parameter, wherein the limiting rule depends on an interrelation between the at least two channels; and deriving a downmix signal and a residual signal from the audio signal using a down-mixing rule depending on the limited spatial parameter.
According to an eighth aspect of the present invention, this object is achieved by a method of receiving or audio playing, the method having a method for decoding an encoded audio signal, the method comprising a method for decoding an encoded audio signal representing an original audio signal having at least two channels, the encoded audio signal having a downmix signal, a residual signal and a spatial parameter describing an interrelation between the at least two channels, the method comprising: limiting the spatial parameter to derive a limited spatial parameter using a limiting rule, wherein the limiting rule depends on an interrelation between the at least two channels; and deriving a reconstruction of the original audio signal from the downmix signal and the residual signal using an up-mixing rule depending on the limited spatial parameter.

According to a ninth aspect of the present invention, this object is achieved by a transmission system having a transmitter and a receiver, the transmitter having an audio encoder for encoding an audio signal having at least two channels, comprising: a parameter extractor for deriving a spatial parameter from the audio signal, wherein the spatial parameter describes an interrelation between the at least two channels; a limiter for limiting the spatial parameter using a limiting rule to derive a limited spatial parameter, wherein the limiting rule depends on an interrelation between the at least two channels; and a down-mixer for deriving a downmix signal and a residual signal from the audio signal using a down-mixing rule depending on the limited spatial parameter; and the receiver having an audio decoder for decoding an encoded audio signal representing an original audio signal having at least two channels, the encoded audio signal having a downmix signal, a residual signal and a spatial parameter describing an interrelation between the at least two channels, comprising: a limiter for limiting the spatial parameter
to derive a limited spatial parameter using a limiting rule, wherein the limiting rule depends on an interrelation between the at least two channels; and an up-mixer for deriving a reconstruction of the original audio signal from the downmix signal and the residual signal using an up-mixing rule depending on the limited spatial parameter.

According to a tenth aspect of the present invention, this object is achieved by a method of transmitting and receiving, the method including a transmitting method having a method of generating an encoded signal of an audio signal having at least two channels, the method comprising: deriving a spatial parameter from the audio signal, wherein the spatial parameter describes an interrelation between the at least two channels; limiting the spatial parameter using a limiting rule to derive a limited spatial parameter, wherein the limiting rule depends on an interrelation between the at least two channels; and deriving a downmix signal and a residual signal from the audio signal using a down-mixing rule depending on the limited spatial parameter; and a receiving method, having a method for decoding an encoded audio signal, the method comprising: limiting the spatial parameter to derive a limited spatial parameter using a limiting rule, wherein the limiting rule depends on an interrelation between the at least two channels; and deriving a reconstruction of the original audio signal from the downmix signal and the residual signal using an up-mixing rule depending on the limited spatial parameter.
According to an eleventh aspect of the present invention, this object is achieved by an encoded audio signal being a representation of an audio signal having at least two channels, the encoded audio signal having a spatial parameter describing an interrelation between the at least two channels, a downmix signal and a residual signal, wherein the downmix signal and the residual signal are derived from the audio signal using a down-mixing rule depending on a limited spatial parameter derived using a limiting rule depending on an interrelation of the at least two channels.

The present invention is based on the finding that an audio signal having at least two channels can be efficiently down-mixed into a downmix signal and a residual signal when the down-mixing rule used depends on a spatial parameter that is derived from the audio signal and that is post-processed by a limiter, which applies a certain limit to the derived spatial parameter with the aim of avoiding instabilities during the up-mixing or down-mixing process. By having a down-mixing rule that dynamically depends on parameters describing an interrelation between the audio channels, one can ensure that the energy within the down-mixed residual signal is as small as possible, which is advantageous in view of coding efficiency. By post-processing the spatial parameter with a limiter prior to using it in the down-mixing, one can avoid instabilities in the down- or up-mixing, which otherwise could result in a disturbance of the spatial perception of the encoded or decoded audio signal.

In one embodiment of the present invention, an original stereo signal having a left and a right channel is supplied to a down-mixer and a parameter extractor. The parameter extractor derives the commonly known spatial parameters ICC (Inter-Channel-Correlation) and IID (Inter-Channel-Intensity-Difference).
The down-mixer is able to downmix the left and right channels into a downmix signal and a residual signal, wherein the down-mixing rule is such that the resulting residual signal carries the minimum achievable energy. Therefore, subsequent compression of the resulting residual signal by a standard audio encoder will result in an extremely compact code. This can be achieved by formulating the down-mixing rule as a function of the spatial parameters ICC and IID, since both parameters describe intensity or amplitude ratios of the original stereo channels.

A general problem during encoding is energy preservation. It is necessary that both the original signal and the encoded signal contain the same energy, since a violation of the energy conservation would result in a different loudness perception of the encoded signals or even in uncontrollable jumps in the loudness of the encoded signal. Therefore, in the above encoding scheme the downmix signal and the residual signal have to be scaled by a scaling factor that ensures energy conservation. If the original audio signal that is to be encoded has special properties, this scaling factor can diverge, in particular when the left and the right original channel are perfectly anti-correlated, i.e. when they have the same amplitudes and a phase shift of precisely 180°. This instability is avoided within the inventive concept by applying a limiting function to the ICC parameter, wherein the limiting function depends on a maximum acceptable scaling factor and the IID parameter. To avoid a possible divergence, the rule that describes the down-mixing is altered directly, whereas in state-of-the-art implementations the scaling factor is simply limited by setting a threshold, the scaling factor being replaced by the threshold value whenever it exceeds the threshold.
It is a major advantage of the inventive concept that both the signal within the downmix channel and the signal within the residual channel are altered by altering the parameters underlying the down-mixing process. Only the signal in the downmix channel would be influenced when applying a threshold according to prior art; thus a better preservation of the interrelation between the original left and right channel can be achieved when following the inventive concept.

Another advantage of the concept described above is that the spatial parameters used are generally derived during an encoding process anyway. Therefore, one can implement the necessary limiting logic without having to introduce new parameters.

In a further embodiment of the present invention, a limiter is applied at the decoder side, having the same limiting rule as the limiter on the encoder side. This means that on the decoder side, the downmix and the residual signal as well as the spatial parameters IID and ICC are received, and the received spatial parameters are limited using the same limiting rule used during the encoding process. The up-mixing then depends on the limited spatial parameters, ensuring that no divergence occurs in the up-mixing process. The advantage of having the same limiting rules in the encoding and the decoding is obvious, since one only has to develop hardware circuits or an implementation of a software algorithm once. Hardware or software having both encoding and decoding functionality can be developed at lower cost, since one is able to reuse the same hardware or software for the limiting functionality.

In a further embodiment of the present invention, the down-mixed signals and the spatial parameters are compressed after their generation, yielding two audio bit streams for the down-mixed signals and a parameter bit stream holding the compressed spatial parameters.
This reduces the size of the encoded representation to be transmitted, further saving bandwidth, wherein the encoding may be lossy or lossless, since the encoding rule itself is independent of the inventive concept. An inventive decoder according to the inventive concept then comprises a decompression stage, where the compressed representations are decompressed into the spatial parameters, the down-mixed channel and the residual channel prior to up-mixing.

In another embodiment of the present invention, the already compressed audio bit streams and the parameter bit stream are combined into a combined bit stream, e.g. by multiplexing, allowing for convenient storage of a generated file on a storage medium. This also allows for streaming applications, for example streaming the encoded content via the internet, since all the relevant information is comprised in one single file or bit stream, allowing for more convenient handling than in a case where three separate bit streams would be transferred. The corresponding inventive decoder then has a decombination stage, which could for example be a demultiplexer, to decombine the bit stream into three separate bit streams, namely the two audio bit streams and the parameter bit stream.

It is to be noted here that the inventive concept provides perfect backward compatibility to prior art residual coding, where the spatial parameters are not limited, and even to prior art parametric stereo coding, where a decoder does not make use of the residual signal. This is of course a major advantage, since newly encoded audio data can be reproduced with the maximum possible quality by inventive decoders, whereas it may also be reproduced by already existing decoders according to prior art.
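The combination of the two audio bit streams and the parameter bit stream into one stream, and the corresponding decombination at the decoder, can be sketched as follows. The length-prefixed framing, the stream order and the function names are assumptions made purely for illustration; the text does not specify a bit stream format.

```python
import struct

def multiplex(downmix_bits, residual_bits, parameter_bits):
    """Combine three byte streams into one, each preceded by its length.

    Hypothetical framing: a 4-byte big-endian length followed by the payload.
    """
    out = bytearray()
    for payload in (downmix_bits, residual_bits, parameter_bits):
        out += struct.pack(">I", len(payload))
        out += payload
    return bytes(out)

def demultiplex(combined):
    """Split a stream produced by multiplex back into its three parts."""
    streams = []
    offset = 0
    for _ in range(3):
        (length,) = struct.unpack_from(">I", combined, offset)
        offset += 4
        streams.append(combined[offset:offset + length])
        offset += length
    return tuple(streams)

combined = multiplex(b"downmix", b"residual", b"iid+icc")
downmix_bits, residual_bits, parameter_bits = demultiplex(combined)
```

With such a framing, a decoder that does not use the residual signal could read the length field of the residual stream and skip its payload, which is one simple way to picture the backward compatibility described above.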
In a further embodiment of the present invention, three inventive encoders are combined to encode a multi-channel audio signal comprising six individual channels, wherein each of the three inventive encoders encodes a pair of channels, deriving spatial parameters, a downmix and a residual signal for each of the channel pairs. The inventive concept can thereby also be used to encode multi-channel audio signals, where the efficiency of the coding and the compactness of the resulting representation have an even higher priority, since the total amount of data to be encoded and transmitted is much higher than for a stereo signal. In principle, an arbitrary number of inventive audio encoders can be combined to simultaneously encode a multi-channel audio signal having basically any number of single audio channels.

In a further embodiment of the multi-channel audio encoder, the individual downmix signals and residual signals as well as the individual parameter bit streams are combined by a 3-to-2 down-mixer to obtain a common left signal, a common right signal, a common residual signal and a combined parameter bit stream, further reducing the amount of required bandwidth. The corresponding decoders then straightforwardly comprise a 2-to-3 up-mixer stage.

In another embodiment of the present invention, a transmitter or audio recorder comprises an inventive encoder, allowing for compact, high-quality audio recording or transmitting, wherein the size of the transmitted or stored audio content can be significantly reduced. Thus, more audio content can be stored on a storage medium of a given capacity, or less bandwidth is used during transmission of the audio signal.

In another embodiment, a receiver or audio player has an inventive decoder, allowing for streaming applications in limited-bandwidth environments such as mobile phones, or allowing for the construction of small portable playback devices using storage media of limited capacity.
A combination of an inventive transmitter and receiver yields a transmission system, allowing audio content to be conveniently transmitted via wired or wireless transmission interfaces, such as wireless LAN, Bluetooth, wired LAN, power line technologies, radio transmission, or any other type of data transmission.

Preferred embodiments of the present invention are subsequently described by referring to the enclosed drawings, wherein:

Fig. 1 shows a block diagram of an inventive encoder;
Fig. 2 shows a block diagram of the inventive encoding principle;
Fig. 3 shows another embodiment of an inventive encoder;
Fig. 4 shows the backwards compatibility of the inventive encoding scheme to prior art decoders;
Fig. 5 shows an inventive multi-channel audio encoder;
Fig. 6 shows a block diagram of an inventive audio decoder;
Fig. 7 shows a block diagram of the inventive decoding concept;
Fig. 8 shows a further embodiment of an inventive decoder;
Fig. 9 shows an embodiment of an inventive multi-channel audio decoder;
Fig. 10 shows an alternative embodiment of an inventive audio encoder;
Fig. 11 shows an alternative embodiment of an inventive audio decoder;
Fig. 12 shows an inventive transmitter/audio-recorder;
Fig. 13 shows an inventive receiver/audio-player;
Fig. 14 shows an inventive transmission system.

Detailed description of preferred embodiments

Fig. 1 shows a block diagram of an inventive audio encoder 10, comprising a down-mixer 12, a limiter 14, and a parameter extractor 16.

A stereo signal 18, having a left and a right channel, is input into the down-mixer 12 and into the parameter extractor 16 simultaneously. The parameter extractor 16 extracts spatial parameters 19 describing an interrelation between the left and the right channel of the stereo signal 18. These parameters are on the one hand made available for transmission and on the other hand input into the limiter 14. The limiter 14 applies a limiting rule to the parameters.
The details of an appropriate limiting rule shall be derived in the following paragraphs. The limiter derives limited spatial parameters and these are input into the down-mixer 12, wherein the down-mixer 12 applies a down-mixing rule to the left and right channel of the stereo signal 18 to derive a downmix signal 20 and a residual signal 22 from the left and the right channel of the stereo signal. The down-mixing rule additionally depends on the limited spatial parameter.

When an appropriate limiting rule is chosen for the limiter, the down-mixer 12 is only supplied with parameters that are limited in such a way that the down-mixing rule neither diverges nor produces any output that deteriorates the spatial interrelation of the left and the right channel because of the down-mixing. As a result, the stereo signal 18 is represented by the downmix signal 20, the residual signal 22, and the spatial parameters 19 after the encoding process performed by the audio encoder 10.

To understand how a down-mixing rule and a limiting rule have to interrelate to provide a resulting residual signal 22 containing the minimal feasible energy while simultaneously limiting a spatial parameter such that the down-mixing rule does not cause any divergences, the basic concept underlying the present invention is elaborated in more detail in the following few paragraphs.

The parameters extracted by the parameter extractor 16 typically result from a single time and frequency interval of sub-band samples from a complex modulated filter bank analysis of discrete time signals. That means that the audio signal of the left and right channel of the stereo signal 18 is first divided into time frames of a given length, and within a single time frame, the frequency spectrum is sub-divided into a number of sub-band samples.
For each single sub-band, the parameter extractor 16 then derives a spatial parameter by comparing the left and right channels of the stereo signal within the sub-band of interest.

Therefore, the left and the right channel of the stereo signal 18 and the downmix signal m and the residual signal s from Fig. 1 have to be understood as discrete and finite-length vectors, describing the underlying signals within a discrete time interval. As mentioned above, energy preservation must be ensured during down-mixing. For discrete complex vectors x, y, the complex inner product and squared norm (comparable to energy) are defined by

$$\langle x, y \rangle = \sum_n x(n)\, y^*(n), \qquad X = \|x\|^2 = \langle x, x \rangle = \sum_n |x(n)|^2. \qquad (1)$$

Following the normal convention, * denotes complex conjugation. From here on, upper-case letters denote the squared norm, or energy, of the corresponding finite-length complex vectors denoted by lower-case letters.

According to the present invention, the downmix channel m resulting from the adaptive downmix is the energy-weighted sum of the original left and right channel, and thus defined by

$$m = g\,(l + r), \qquad (2)$$

where g is a real and positive gain factor adjusted such that the energy of the downmix (M) equals the sum of the energies of the left (L) and right (R) channel signal vectors (M = L + R). As this gain factor diverges to infinity when l and r are out of phase and have comparable energy (i.e. l + r = 0 in equation (2)), it is necessary to limit this factor by a maximal gain factor g0 that is typically within the interval [1,2]. The parameter extractor 16, as shown in Fig. 1, extracts the spatial audio parameters IID (Interchannel Intensity Difference) and ICC (Interchannel Coherence) that are represented here by

$$c = \sqrt{\frac{L}{R}}, \qquad \rho = \frac{\operatorname{Re}\langle l, r \rangle}{\sqrt{L R}}. \qquad (3)$$

Here, c denotes the IID parameter and ρ denotes the ICC parameter. The gain factor g can be expressed in terms of the ICC and IID parameters, so that the required limitation of the gain factor can be written as follows:

$$g = \min\left\{ g_0,\ \sqrt{\frac{L + R}{L + R + 2\operatorname{Re}\langle l, r \rangle}} \right\} = \min\left\{ g_0,\ \sqrt{\frac{c^2 + 1}{c^2 + 1 + 2\rho c}} \right\}. \qquad (4)$$

To achieve maximum coding efficiency, it is desired that the energy within the residual signal 22 is minimal.
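The relation between the sub-band vectors, the IID and ICC parameters and the limited downmix gain can be made concrete with a small numerical sketch. It assumes c = sqrt(L/R) and ρ = Re⟨l,r⟩/sqrt(L·R) as the parameter definitions and a gain chosen so that M = L + R whenever the limit g0 is not reached; the function names and the default g0 = 1.5 are hypothetical choices for illustration.

```python
import math

def inner(x, y):
    """Complex inner product: sum of x(n) times the conjugate of y(n)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def spatial_parameters(l, r):
    """Return (c, rho) for one sub-band (assumed definitions, see lead-in)."""
    L = inner(l, l).real
    R = inner(r, r).real
    c = math.sqrt(L / R)                      # level parameter (IID)
    rho = inner(l, r).real / math.sqrt(L * R)  # coherence parameter (ICC)
    return c, rho

def downmix_gain(c, rho, g0=1.5):
    """Energy-preserving gain for m = g * (l + r), limited by g0."""
    denom = c * c + 1.0 + 2.0 * rho * c
    if denom <= 0.0:
        # l + r vanishes (anti-phase, equal energy): only the limit applies.
        return g0
    return min(g0, math.sqrt((c * c + 1.0) / denom))

# Identical channels: the gain is below the limit and energy is preserved.
l = [1 + 0j, 1j]
r = [1 + 0j, 1j]
c, rho = spatial_parameters(l, r)
g = downmix_gain(c, rho)
m = [g * (a + b) for a, b in zip(l, r)]
print(c, rho, g, inner(m, m).real)
```

For channels in exact anti-phase with equal energy, `downmix_gain(1.0, -1.0)` returns g0, which is the case in which the unlimited gain would diverge.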
The following derivation solves a more general optimization problem comprising an additional residual signal t, which then turns out to be superfluous due to (9). Considering the problem from the decoder side, one needs to determine gains a, b such that the residual signals s, t in the up-mix (5) have minimal energy. The solution is given by (6), where p is defined by (7). The same problem, with the additional restriction that the coefficients a, b are real, has the solution given by taking the real part of (7) and inserting it in (6). In this case, p can be expressed in terms of the PS parameters c, ρ by equation (8). By inserting (6) into (5) and adding the two equations in (5), equation (9) follows.
Describing the up-mixing process in the usual matrix notation, the up-mixing can be represented by a rotator matrix H, given in (10). In the case where g is not limited by g0 in (4), a different representation of the optimal coefficients a, b is given by (11) and (12). The first column of the rotator matrix H is identical to the amplitude rotator used in parametric stereo, as derived for example in WO 03/090206 A1.
The downmix needs to be compatible with the up-mix in the sense that perfect reconstruction is obtained when all lossy coding steps are omitted. As a consequence, the down-mixing matrix D must be the inverse of the up-mix rotator H. An elementary computation yields the matrix D in (13), whose first row is consistent with (2).
There is a stability problem with the two optimal rotators given by (10) and (13). As (c, ρ) approaches (1, -1), the value of p given by (8) diverges. Therefore one has to deviate from the optimal rotators in a neighborhood of this point of the PS parameter domain. The solution taught by the present invention is to modify the PS parameters by an instability limiter both in the encoder and in the decoder. In its general form, such a limiter alters the values of the pair (c, ρ) in a neighborhood of (1, -1) in order to achieve a bounded range for p.
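Equations (5) to (13) cited in this derivation are not reproduced in the text. A reconstruction that is consistent with the prose (least-squares gains, the relation a + b = 1/g, and an inverse matrix whose first row reproduces m = g(l + r)) is sketched below. It assumes c = sqrt(L/R) and ρ = Re⟨l, r⟩/sqrt(LR); the exact layout of H and D and the assignment of the numbers (11) to (13) are assumptions, not taken from the source.

```latex
\begin{align}
l &= a\,m + s, \qquad r = b\,m + t \tag{5}\\
a &= \frac{1}{2g}\,(1 - p), \qquad b = \frac{1}{2g}\,(1 + p) \tag{6}\\
p &= \frac{R - L + \langle r, l\rangle - \langle l, r\rangle}{\|l + r\|^{2}} \tag{7}\\
p &= \frac{1 - c^{2}}{1 + c^{2} + 2\rho c} \tag{8}\\
t &= -s \tag{9}\\
\begin{bmatrix} l \\ r \end{bmatrix} &= H \begin{bmatrix} m \\ s \end{bmatrix}, \qquad
  H = \begin{bmatrix} a & 1 \\ b & -1 \end{bmatrix} \tag{10}\\
a &= \frac{c\,(c + \rho)}{\sqrt{(c^{2} + 1)(c^{2} + 1 + 2\rho c)}} \tag{11}\\
b &= \frac{1 + \rho c}{\sqrt{(c^{2} + 1)(c^{2} + 1 + 2\rho c)}} \tag{12}\\
D &= H^{-1} = \begin{bmatrix} g & g \\ \tfrac{1}{2}(1 + p) & -\tfrac{1}{2}(1 - p) \end{bmatrix} \tag{13}
\end{align}
```

Under these assumptions, (6) and (7) follow from minimizing ‖l − a m‖² and ‖r − b m‖² with m = g(l + r), which gives a = ⟨l, m⟩/M and b = ⟨r, m⟩/M. Equation (8) is the real part of (7) with numerator and denominator divided by R, and (9) follows because a + b = 1/g, so that l + r = m/g + s + t forces s + t = 0.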
A particularly attractive solution is based on the observation that the denominator of (8) is the same as that of (4). The inventive solution keeps c unaltered and modifies ρ exactly when the adaptive downmix gain g is limited by g0 in (4). This occurs when ρ falls below a minimum value that depends on c and g0. The preferred modification performed by the instability limiter 14 then replaces ρ by a limited value ρ̃ equal to this minimum. The corresponding value p̃, given by inserting ρ̃ in place of ρ in (8), has the property that it remains bounded.
In the previous paragraphs, the problem analysis leading to the definition of the limiter 14 has been detailed. Although the notation is based on stereo signals, it is clear that the same method can be applied to any pair of audio signals, such as channel pairs selected from, or generated by a partial downmix of, a multi-channel audio signal. It is particularly advantageous that the same limiting rule can be used to limit the parameters within the up-mixing and the down-mixing matrix.
Fig. 2 describes the inventive audio encoding procedure using a block diagram. In a first parameter extraction step 30, the ICC and IID parameters are derived. These parameters are forwarded as output 23 and also serve as input for the limiting step 32, where the ICC parameter is compared with a computed minimal ICC parameter ICCmin, which depends on IID.
In a first case, where the ICC parameter exceeds the minimum ICC parameter ICCmin(IID), the ICC parameter is directly forwarded to the down-mixing step 34.
If the ICC parameter does not exceed ICCmin(IID), an additional exchange step 36 is performed, in which the value of the ICC parameter is replaced by the value of the minimal ICC parameter ICCmin(IID). After the exchange step 36, the ICC parameter having the new value is transferred to the down-mixing step 34.
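The limiting and exchange steps, and the fact that the same limiting rule serves both the down-mixing and the up-mixing matrix, can be illustrated with a minimal sketch for a single time/frequency tile. Everything specific here is an assumption made for illustration: the definitions c = sqrt(L/R) and ρ = Re⟨l, r⟩/sqrt(LR), the closed form ICCmin = ((c + 1/c)/2)(g0⁻² − 1) obtained by setting the energy-preserving gain equal to g0, the down-mix and up-mix matrices, the default g0 = 1.5, and all function names.

```python
import numpy as np

def spatial_params(l, r):
    """Return (c, rho): assumed IID c = sqrt(L/R) and ICC rho = Re<l,r>/sqrt(L*R)."""
    L = np.vdot(l, l).real
    R = np.vdot(r, r).real
    rho = np.vdot(r, l).real / np.sqrt(L * R)  # np.vdot conjugates its first argument
    return np.sqrt(L / R), rho

def icc_min(c, g0):
    """Assumed ICCmin(IID): the rho at which the energy-preserving gain equals g0."""
    return 0.5 * (c + 1.0 / c) * (g0 ** -2 - 1.0)

def limit_icc(c, rho, g0):
    """Limiting / exchange step: keep rho if it exceeds ICCmin, else replace it."""
    return max(rho, icc_min(c, g0))

def gain_and_p(c, rho):
    """Downmix gain g and rotator parameter p for an already limited pair (c, rho)."""
    denom = c * c + 1.0 + 2.0 * rho * c
    return np.sqrt((c * c + 1.0) / denom), (1.0 - c * c) / denom

def encode(l, r, g0=1.5):
    """Parameter extraction, limiting and down-mixing for one tile."""
    c, rho = spatial_params(l, r)
    g, p = gain_and_p(c, limit_icc(c, rho, g0))
    m = g * (l + r)                              # downmix signal
    s = 0.5 * ((1.0 + p) * l - (1.0 - p) * r)    # residual signal
    return m, s, (c, rho)                        # the unmodified parameters are output

def decode(m, s, params, g0=1.5):
    """Apply the same limiter as the encoder, then up-mix with the inverse matrix."""
    c, rho = params
    g, p = gain_and_p(c, limit_icc(c, rho, g0))
    l = (1.0 - p) / (2.0 * g) * m + s
    r = (1.0 + p) / (2.0 * g) * m - s
    return l, r
```

Because the decoder recomputes the limited coherence from the transmitted, unmodified parameters, `decode(*encode(l, r))` reproduces `l` and `r` up to rounding when no lossy coding is applied, even for nearly out-of-phase inputs where the unlimited gain would grow without bound.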
In the down-mixing step 34, the downmix signal 20 and the residual signal 22 are derived from the channels l and r, depending on the parameters ICC and IID.
Finally, the parameters 23 (ICC and IID), the downmix signal 20 and the residual signal 22 are available as output of the encoding procedure.
Fig. 3 shows another embodiment, an inventive audio encoding device 50 that comprises an audio encoder 10, a signal processing unit 51 having a first audio compressor 52, a second audio compressor 54 and a parameter compressor 56, and an output interface 58.
The components of the audio encoder 10 have already been discussed in the previous paragraphs. Therefore, only those parts of the audio encoding device 50 that extend the audio encoder 10 are discussed in the following paragraphs. The general purpose of the signal processing unit 51 is to compress the downmix signal 20, the residual signal 22 and the parameters 23. Therefore, the downmix signal 20 is input into the first audio compressor 52, the residual signal 22 is input into the second audio compressor 54 and the spatial parameters 23 are input into the parameter compressor 56. The first audio compressor 52 derives a first audio bit stream 60, the second audio compressor 54 derives a second audio bit stream 62 and the parameter compressor 56 derives a parameter bit stream 64. The first and the second audio bit streams (60, 62) and the parameter bit stream 64 are then used as input of the output interface 58, which combines the three bit streams (60, 62, 64) to derive a combined bit stream 66, which is the output of the inventive encoding device 50.
The combination performed by the output interface 58 could, for example, be a simple multiplexing of the three incoming bit streams. Furthermore, any kind of combination that leads to a single output bit stream 66 is possible. A single bit stream is much more convenient to handle, for example when streaming via the internet or other data links.
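As an illustration of what a simple multiplexing of the three bit streams could look like, the sketch below concatenates them with a length prefix per stream. The framing (4-byte big-endian lengths, fixed order downmix, residual, parameters) and the function names are hypothetical and are not taken from the description or from any standard.

```python
import struct

def multiplex(downmix_bits, residual_bits, param_bits):
    """Concatenate three bit streams, each prefixed with a 4-byte big-endian length."""
    out = bytearray()
    for chunk in (downmix_bits, residual_bits, param_bits):
        out += struct.pack(">I", len(chunk))
        out += chunk
    return bytes(out)

def demultiplex(stream):
    """Split a combined stream back into (downmix, residual, parameter) bit streams."""
    chunks, pos = [], 0
    for _ in range(3):
        (length,) = struct.unpack_from(">I", stream, pos)
        pos += 4
        chunks.append(stream[pos:pos + length])
        pos += length
    return tuple(chunks)
```

With such a framing, a receiver that has no use for one of the streams can skip it by reading its length field and advancing past the payload.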
In other words, Fig. 3 illustrates an encoder that takes a two-channel audio signal comprising the channels l, r as input and generates a bitstream that permits decoding by a parametric stereo decoder. The adaptive downmix takes the two-channel signal l, r and generates a mono downmix m and a residual signal s. These signals can then be encoded by perceptual audio encoders to produce compact audio bitstreams. The parametric stereo (PS) parameter estimation takes the two-channel signal l, r as input and generates a set of PS parameters. The instability limiter modifies the PS parameters which control the adaptive downmix. The encoding block produces the parametric stereo side information (PS sideinfo) from the unmodified output of the PS parameter estimation. The multiplexer combines all encoded data to form the combined bitstream.
It is one of the major advantages of the inventive coding concept that it is fully backwards compatible with prior art parametric stereo decoders. To illustrate this, Fig. 4 shows a prior art parametric stereo decoder.
The parametric stereo decoder 70 comprises an input interface 72, an audio decoder 74, a parameter decoder 76, and an up-mixer 78.
The input interface 72 receives a combined bit stream 80 as produced by the inventive audio encoding device 50. The input interface 72 of the prior art parametric stereo decoder 70 does not recognize the residual signal 22 and therefore extracts only the encoded downmix signal (the first audio bit stream 60 from Fig. 3) and the parameter bit stream 64 from the input bit stream 80. The audio decoder 74 is the complementary device to the first audio compressor 52 and the parameter decoder 76 is the complementary device to the parameter compressor 56. Therefore, the audio bit stream 60 is decoded into the downmix signal 20 and the parameter bit stream 64 is decoded into the spatial parameters 23.
Since the spatial parameters 23 have been transferred directly and have not been further processed by the inventive encoder 10 or 50, a prior art up-mixer 78 can reconstruct a left and a right channel, building an output signal 80 from the downmix signal 20 using the spatial parameters 23.
In other words, Fig. 4 illustrates a parametric stereo decoder that takes a compatible bitstream as generated by an inventive encoding device 50 as input and generates the stereo audio signal comprising the channels l and r, without using or having access to the part of the bitstream that describes the residual signal. First, a demultiplexer takes the compatible bitstream as input and decomposes it into one audio bitstream and the PS sideinfo. The perceptual audio decoder produces a mono signal m, and the PS sideinfo is decoded into PS parameters. The PS synthesis converts the mono signal into left and right signals l and r in accordance with the PS parameters, in particular by adding a decorrelated signal in order to retain the channel correlation of the original stereo channels.
Fig. 5 shows an inventive multi-channel audio encoder 100 that encodes a 6-channel audio signal into a stereo downmix and a number of parameter sets.
The multi-channel audio encoder 100 comprises a first adaptive encoder 102, a second adaptive encoder 104, a summation module 106, a parameter extractor 108, and a 3 to 2 down-mixer 110.
The first adaptive encoder 102 and the second adaptive encoder 104 are embodiments of an inventive encoder 10. The 6-channel input signal has a left front channel 112a, a left rear channel 112b, a right front channel 114a, a right rear channel 114b, a center channel 116a, and a low frequency enhancement channel 116b. The left front channel 112a and the left rear channel 112b are input into the first adaptive encoder 102, which derives a first downmix signal 118a, the corresponding residual signal 118b and spatial parameters 118c.
The right front channel 114a and the right rear channel 114b are input into the second adaptive encoder 104, which derives a second downmix signal 120a, the corresponding residual signal 120b, and the underlying spatial parameters 120c. The center channel 116a and the low frequency enhancement channel 116b are input into the summation module 106, which adds the signals to create a mono signal 122a and corresponding spatial parameters 122b.
The 3 to 2 down-mixer 110 receives the downmix signals 118a, 120a, and 122a and down-mixes them into a stereo output signal 124 having a left and a right channel. The 3 to 2 down-mixer additionally derives a residual signal 126 from the input channels 118a, 120a, and 122a. Furthermore, the 3 to 2 down-mixer 110 derives a parameter set 128 from the parameter sets 118c, 120c, and 122b.
In short, Fig. 5 illustrates a part of a spatial audio encoder that takes as input a multi-channel audio signal in 5.1 format, comprising the channels Lf (left front), Lr (left surround), Rf (right front), Rr (right surround), C (centre) and LFE (low-frequency enhancement), and that creates a stereo down-mix, comprising L0 and R0, and a number of parameter sets. Not shown in this figure are time to frequency transforms, coding of the down-mix signals and parameters, and multiplexing of the coded information into a bitstream which can be decoded by a corresponding spatial audio decoder. The adaptive down-mix takes as input the signals Lf and Lr and produces a mono signal L and a residual signal L. The parametric stereo (PS) parameter estimation takes the two-channel signal Lf and Lr as input and generates a set of PS parameters. The instability limiter modifies the PS parameters that control the adaptive down-mix. In a similar manner, the adaptive down-mix takes as input the signals Rf and Rr and produces a mono signal R and a residual signal R.
The parametric stereo (PS) parameter estimation takes the two-channel signal Rf and Rr as input and generates a set of PS parameters. The instability limiter modifies the PS parameters that control the adaptive down-mix. The summation module adds the signals C and LFE to create a mono signal C. The parametric stereo (PS) parameter estimation takes the two-channel signal C and LFE as input and generates a set of IID parameters, a subset of the PS parameters. The mono signals L, R and C are mixed to a stereo signal (L0 and R0) and a residual signal E0 by the 3 to 2 module. The 3 to 2 module also outputs a parameter set {L0, R0}.
Fig. 6 describes an inventive audio decoder 140 comprising an up-mixer 142 and a limiter 144. The inventive decoder 140 receives a downmix signal 146, a residual signal 148 and spatial parameters 150. The downmix signal 146 and the residual signal 148 are input into the up-mixer 142, whereas the spatial parameters 150 are input into the limiter 144. The limiter 144 limits the spatial parameters 150 to derive limited spatial parameters 152. It is important to note that the limiter uses the same limiting rule to derive the limited parameters as the corresponding encoder did during the encoding process. The limited parameters control the up-mixing process in the up-mixer 142, which derives a stereo signal 154 having a left and a right channel from the downmix signal 146 and the residual signal 148.
Fig. 7 shows a block diagram illustrating the principle of an inventive decoder. In a first limiting step 160, the received spatial parameters ICC and IID are limited. That is, it is checked whether the received ICC parameter exceeds a minimum ICC parameter ICCmin(IID). If this is the case, the spatial parameters 150 (ICC and IID), a received downmix signal 146, and a received residual signal 148 are transmitted to the up-mixing step 162.
If the ICC parameter does not exceed the minimum ICC parameter ICCmin(IID), a limiting step 164 is additionally performed, in which the value of the ICC parameter is replaced by the value of ICCmin(IID), so that the value of ICCmin(IID) is transmitted to the up-mixing step 162.
In the up-mixing step 162, a stereo signal 154 having a left and a right channel is derived from the downmix signal 146 and the residual signal 148, using the spatial parameters ICC and IID.
Fig. 8 shows a further embodiment, an inventive decoding device 180 that comprises a decoder 140 and a signal-processing unit 182 having a first audio decoder 184, a second audio decoder 186 and a parameter decoder 188. The decoding device 180 further comprises an input interface 190 for receiving a combined bit stream 192 that is generated by an inventive encoding device 50.
The combined bit stream 192 is decomposed by the input interface 190 into a first audio bit stream 194a, a second audio bit stream 194b and a parameter bit stream 196.
The first audio bit stream 194a is input into the first audio decoder 184, the second audio bit stream 194b is input into the second audio decoder 186, and the parameter bit stream 196 is input into the parameter decoder 188. The decompressed downmix signal 198 (m) and the residual signal 200 (s) are input into the up-mixer 142 of the decoder 140. Spatial parameters 202 derived by the parameter decoder 188 are input into the limiter 144 of the audio decoder 140. The limiting of the spatial parameters and the up-mixing have already been described for the audio decoder 140; a detailed description can be found in the paragraphs describing Fig. 6. The inventive decoding device 180 finally outputs a stereo signal 204 having a left and a right channel.
In other words, Fig. 8 illustrates a parametric stereo decoder that takes a compatible bitstream as input and generates the stereo audio signal comprising the channels l and r. First, a demultiplexer takes the compatible bitstream as input and decomposes it into two audio bitstreams and the PS sideinfo. Perceptual audio decoders produce a mono signal m and a residual signal s respectively, and the PS sideinfo is decoded into PS parameters by the parameter decoder. The instability limiter modifies the PS parameters. The up-mixer converts the mono and residual signals into left and right signals l and r by means of a rotation matrix defined from the PS parameters modified by the instability limiter.
Fig. 9 shows an inventive multi-channel audio decoder 210 comprising a first two-channel decoder 212, a second two-channel decoder 214, a synthesis module 216, and a 2 to 3 module 218.
Fig. 9 illustrates part of a spatial audio decoder that takes as input a stereo audio signal (comprising L0 and R0), a residual signal E0 and a parameter set {L0, R0}. The 2 to 3 module 218 produces three audio channels L, R, and C from the above-mentioned input. The mono channel L and the residual channel L are converted by the first two-channel decoder 212 into the Lf and Lr output signals. The instability limiter modifies the PS parameter set L. Similarly, the mono channel R and the residual channel R are converted by the second two-channel decoder 214 into the Rf and Rr output signals. The instability limiter is the same as used during the generation of the mono channel R and modifies the PS parameter set R. The PS synthesis module 216 takes the mono channel C and the parameter set C and generates the C and LFE output channels.
Figs. 10 and 11 show an alternative solution for an encoder and a decoder avoiding the instability problem. The alternative is based on using the limited spatial parameters as the parameters to be encoded and transmitted. This can be seen in the inventive encoder in Fig. 10, which is based on the inventive encoding device of Fig. 3.
Fig. 10 shows a modification of the inventive encoder already shown in Fig. 3, with the difference that the parameters fed into the parameter compressor 56 are taken at a point 300, i.e. after the limiting process. That is, the limited parameters are encoded and transmitted instead of the original parameters.
On the decoder side shown in Fig. 11, the modification is that the limiter can be omitted compared to the decoding device 180. Therefore, the decoded spatial parameter 310 is input directly into the up-mixer 142 to derive the stereo signal 204.
The disadvantages of this solution compared to the placement of instability limiters as taught before and shown in the previous figures are twofold. First, the quantization of the limited parameters would move the rotators further away from optimality than necessary. The residual would therefore be larger in general, leading to a loss in coding gain for the residual coding method. Second, backwards compatibility with parametric stereo decoding would be lost. In critical cases, when the channel correlation of the original channels is negative, the decoder would not be able to reproduce this correlation without access to the residual signal.
Fig. 12 shows an inventive audio transmitter or recorder 330 that has an audio encoder 50, an input interface 332 and an output interface 334.
An audio signal can be supplied at the input interface 332 of the transmitter/recorder 330. The audio signal is encoded by an inventive encoder 50 within the transmitter/recorder, and the encoded representation is output at the output interface 334 of the transmitter/recorder 330. The encoded representation may then be transmitted or stored on a storage medium.
Fig. 13 shows an inventive receiver or audio player 340 having an inventive audio decoder 180, a bit stream input 342, and an audio output 344.
A bit stream can be input at the input 342 of the inventive receiver/audio player 340. The bit stream is then decoded by the decoder 180, and the decoded signal is output or played at the output 344 of the inventive receiver/audio player 340.
Fig. 14 shows a transmission system comprising an inventive transmitter 330 and an inventive receiver 340.
The audio signal input at the input interface 332 of the transmitter 330 is encoded and transferred from the output 334 of the transmitter 330 to the input 342 of the receiver 340. The receiver decodes the audio signal and plays back or outputs the audio signal at its output 344.
The above-described embodiments of the present invention are merely illustrative of the principles of the present invention for the improvement of adaptive residual coding. It is understood that modifications and variations of the arrangements and details described herein will be apparent to others skilled in the art. It is the intent, therefore, to be limited only by the scope of the appended patent claims and not by the specific details presented by way of description and explanation of the embodiments herein.
Although the embodiments of the present invention described in the figures above are described using mainly a nomenclature used for stereo signals, it is apparent that the present invention is not limited to stereo signals but could be applied to any other kind of combination of two audio signals, as is done for example within the multi-channel audio encoders and decoders shown in Fig. 5 and Fig. 9.
Using an inventive transmission system having a transmitter and a receiver, the transmission between the transmitter and the receiver can be achieved by various means.
This can be, for example, live streaming over the internet or other network media, storing a file on a computer-readable medium and transferring the medium, or directly connecting the transmitter and the receiver by cable or by a wireless link such as wireless LAN or Bluetooth, or any other imaginable data connection.
Although it has been described in detail that only the ICC parameter is changed to assure a non-diverging up- and downmix matrix, it is also possible to limit both the IID and ICC parameters such that no divergence will occur. More generally, applying the inventive concept can also mean deriving other spatial parameters and applying a limiting rule to these parameters, assuring a non-diverging down- and up-mix.
The output and input interfaces in the inventive encoders and decoders are not limited to simple multiplexers or demultiplexers. In a more sophisticated variation, the output interface may combine the bit streams not just by multiplexing them but by any other means, possibly even by applying further entropy coding to reduce the size of the bit stream.
Depending on certain implementation requirements, the inventive methods can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, in particular a disk, DVD or CD having electronically readable control signals stored thereon, which cooperate with a programmable computer system such that the inventive methods are performed. Generally, the present invention is, therefore, a computer program product with a program code stored on a machine-readable carrier, the program code being operative for performing the inventive methods when the computer program product runs on a computer. In other words, the inventive methods are, therefore, a computer program having a program code for performing at least one of the inventive methods when the computer program runs on a computer.
While the foregoing has been particularly shown and described with reference to particular embodiments thereof, it will be understood by those skilled in the art that various other changes in form and details may be made without departing from the spirit and scope thereof. It is to be understood that various changes may be made in adapting to different embodiments without departing from the broader concepts disclosed herein and comprehended by the claims that follow.
We Claim:
1. Audio encoder (10) for encoding an audio signal having at least two channels, comprising: a parameter extractor (16) for deriving a coherence parameter (ICC) describing a coherence between a first channel and a second channel of the at least two channels and a level parameter (IID) describing a level difference between the first channel and the second channel as spatial parameters; a limiter (14) for limiting the coherence parameter to derive a limited coherence parameter, wherein the limit of the coherence parameter depends on the level parameter and on a scaling factor; a down-mixer (12) for deriving a downmix signal (20) and a residual signal (18) from the audio signal using a down-mixing rule depending on the limited coherence parameter.
2. The audio encoder (10) as claimed in claim 1, wherein the parameter extractor (16) is operative to derive multiple spatial parameters for a given time portion of the audio signal.
3. The audio encoder (10) as claimed in claim 1 or 2, wherein the limiter (14) is operative to limit the coherence parameter such that a ratio of intensities between the downmix signal (20) and the at least two channels does not exceed a predefined limit.
4. The audio encoder (10) as claimed in claim 3, wherein the predefined gain factor is selected from the interval [1, 2].
5. The audio encoder (10) as claimed in any of claims 1 to 4, wherein the down-mixer (12) is operative to use a down-mixing rule such that the downmix signal (20) and the residual signal (18) are derived by forming a linear combination of the channels from the at least two channels, wherein the coefficients of the linear combination depend on the limited coherence parameter.
6. The audio encoder (10) as claimed in any of claims 1 to 5, comprising a signal processing unit (51) for processing or transmitting the downmix signal (20), the residual signal (18), and the spatial parameters to derive a processed downmix signal, a processed residual signal, and processed parameters.
7. Audio encoder (10) as claimed in claim 6, wherein the signal processing unit (51) is operative to derive the processed downmix signal, the processed residual signal, and the processed parameters such that the deriving includes a compression of the downmix signal (20), the residual signal (18), and the spatial parameters.
8. The audio encoder (10) as claimed in claim 6 or 7, comprising an output interface (58) for providing the information of the processed downmix signal (20), the processed residual signal (18), and the processed spatial parameters.
9. The audio encoder (10) as claimed in claim 8, wherein the output interface (58) is operative to combine the processed downmix signal, the processed residual signal, and the processed parameters to derive an output bit stream having the information of the processed downmix signal, the processed residual signal and the processed parameters.
10. The audio encoder (10) as claimed in claim 9, wherein the output interface (58) is operative to multiplex the processed downmix signal, the processed residual signal, and the processed parameters to derive the output bit stream.
11. The audio encoder (10) as claimed in any of claims 1 to 10, wherein multiple pairs of channels are encoded, and wherein for each pair of channels a spatial parameter, a downmix signal (20) and a residual signal (18) are derived.
12. The audio encoder (10) as claimed in claim 11, wherein the multiple pairs of channels comprise a left front, a left rear, a right front, a right rear, a low frequency enhancement and a center channel.
13. An audio decoder (140) for decoding an encoded audio signal representing an original audio signal having at least two channels, the encoded audio signal having a downmix signal and a residual signal as well as a coherence parameter (ICC) describing a coherence between a first and a second channel of the at least two channels and a level parameter (IID) describing a level difference between the first and the second channel as spatial parameters, comprising: a limiter (144) for limiting the coherence parameter to derive a limited coherence parameter, wherein the limit of the coherence parameter depends on the level parameter and on a scaling factor; and an up-mixer (142) for deriving a reconstruction of the original audio signal (154) from the downmix signal and the residual signal using an up-mixing rule depending on the limited coherence parameter.
14. The audio decoder (140) as claimed in claim 13, wherein the limiter (144) is operative to limit multiple coherence parameters for a given time portion of the encoded audio signal corresponding to a time frame of the original audio signal.
15. The audio decoder (140) as claimed in claim 13 or 14, wherein the limiter (144) is operative to limit the coherence parameter such that a ratio of intensities between the downmix signal and the at least two channels of the original audio signal does not exceed a predefined limit.
16. The audio decoder (140) as claimed in claim 13, wherein the predefined gain factor is chosen from the interval [1, 2].
17. The audio decoder (140) as claimed in any of claims 13 to 16, wherein the up-mixer (142) is operative to use an up-mixing rule such that a first reconstructed channel and a second reconstructed channel of the at least two channels are derived by forming a linear combination of the downmix signal and the residual signal, wherein the coefficients of the linear combination depend on the limited coherence parameter.
18. The audio decoder (140) as claimed in any of claims 13 to 17, comprising a signal processing unit (182) for transmitting or processing a processed residual signal, a processed downmix signal, and processed parameters to derive the residual signal, the downmix signal, and the spatial parameters.
19. The audio decoder (140) as claimed in claim 18, wherein the signal processing unit (182) is operative to derive the residual signal, the downmix signal, and the spatial parameters such that the deriving of the residual signal, the downmix signal and the spatial parameters includes decompression of the processed residual signal, the processed downmix signal, and the processed spatial parameters.
20. The audio decoder (140) as claimed in claim 18 or 19, comprising an input interface (190) for providing the processed residual signal, the processed downmix signal and the processed spatial parameters.
21. The audio decoder (140) as claimed in claim 20, wherein the input interface (190) is operative to decompose a single input bit stream to derive the processed residual signal, the processed downmix signal and the processed parameters.
22. The audio decoder (140) as claimed in claim 21, wherein the input interface (190) is operative to decompose the single input bit stream such that the deriving of the processed residual signal, the processed downmix signal and the processed parameters includes a de-multiplexing of the input bit stream.
23. Method for encoding an audio signal having at least two channels, the method comprising: deriving a coherence parameter (ICC) describing a coherence between a first and a second channel of the at least two channels and a level parameter (IID) describing a level difference between the first and the second channel as spatial parameters; limiting the coherence parameter to derive a limited coherence parameter, wherein the limit of the coherence parameter depends on the level parameter and on a scaling factor; and deriving a downmix signal and a residual signal from the audio signal using a down-mixing rule depending on the limited coherence parameter.
24. A method for decoding an encoded audio signal representing an original audio signal having at least two channels, the encoded audio signal having a downmix signal and a residual signal as well as a coherence parameter (ICC) describing a coherence between a first and a second channel of the at least two channels and a level parameter (IID) describing a level difference between the first and second channel as spatial parameters, the method comprising: limiting the coherence parameter to derive a limited coherence parameter, wherein a limit of the coherence parameter depends on the level parameter and on a scaling factor; and deriving a reconstruction of the original audio signal from the downmix signal and the residual signal using an up-mixing rule depending on the limited coherence parameter.

Full Text

ADAPTIVE RESIDUAL AUDIO CODING
Field of the invention
The present invention relates to the encoding and decoding of audio signals and in particular to the efficient high-quality coding of a pair of audio channels.
Background of the invention and prior art
Recently, effective high-quality coding of audio signals has become more and more important, as digital distribution of compressed audio and video content, e.g. by satellite or by terrestrial digital audio or video broadcasting, is widely used. The well-known MP3 technique, for example, allows for convenient transmission of audio titles over the internet or other transmission channels having limited bandwidths.
In addition to MP3, several other audio coding schemes aim to maximize the audio quality for a given compression ratio or bit rate. It has been shown in "Efficient and scalable Parametric Stereo Coding for Low Bit rate Audio Coding Applications", PCT/SE02/01372, that it is possible to recreate a stereo signal that closely resembles the underlying original stereo image from a mono signal when a very compact representation of the stereo signal, commonly referred to as "spatial cues", is additionally used. The disclosed principle is to divide the stereo input signal into frequency bands and to estimate parameters called inter-channel intensity difference (IID) and inter-channel coherence (ICC) for each of the frequency bands separately. The first parameter describes a measurement of the power distribution between the two channels in the specific frequency band and the second parameter describes an estimation of the correlation between the two channels. A more thorough description of spatial parameters may be found in "High-quality parametric spatial audio coding at low bit rates", J. Breebaart, S. van de Par, A. Kohlrausch and E. Schuijers, Proc. 116th AES Convention, Berlin (Germany), May 8-11, 2004. Based on these spatial cues, the stereo input signal is adaptively combined into a mono signal. Both the spatial cues and the mono signal are coded and the coded representation is multiplexed into a bit stream that is transmitted to the decoder. On the decoder side the stereo image is recreated from the mono signal by distributing the energy of the mono signal between the two output channels in accordance with the IID data, and by adding a decorrelated signal in order to retain the channel correlation of the original stereo channels, as described by the ICC parameters.
When more transmission bandwidth is available, a higher audio quality can be achieved by replacing the decorrelated mono signal in the decoder by a transmitted residual signal. That is, the transmission of an additional residual signal to a decoder is required. This is also the case with mid-side (MS) coding, where the sum and the difference of the channels of a stereo signal are coded rather than the left and right channels directly. A description of the MS technique may be found in "Sum-difference stereo transform coding", Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), San Francisco, USA, 1992, pp. II 569 - 572. MS coding is based on the finding that the left and the right channel of a stereo signal are, with high probability, rather similar. Therefore, the difference of the left and the right channel will yield a signal having a comparatively low intensity most of the time, i.e. the amplitude of the difference signal will be rather small. Hence, one can save a significant amount of bit rate when encoding the difference signal, since the parameters describing the difference signal can be coarsely quantized. The sum signal will evidently need about the same bandwidth as a single left or right channel when encoded. Therefore, one can save a significant amount of bandwidth in total when using the MS coding scheme. When a large intensity difference between the left and the right channel exists, the MS technique has its limits, since the difference channel will then also contain a substantial amount of energy and therefore needs a higher bandwidth. It may be noted, however, that in regular stereo-coded implementations, MS coding will not be applied in this case, due to high encoding costs. In those cases, it is advantageous to have the possibility to switch between normal stereo coding and MS coding, depending on the intensity carried by the original audio channels that have to be encoded.
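The behaviour of MS coding described above can be illustrated with a minimal sketch. The normalisation by one half, the function names and the sample values are assumptions made for illustration only; they are not taken from the cited reference.

```python
def ms_encode(left, right):
    """Mid/side transform: half the sum and half the difference per sample."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Inverse transform: left = mid + side, right = mid - side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


def energy(x):
    """Sum of squared samples of a real-valued signal."""
    return sum(v * v for v in x)


# Similar channels: the side (difference) signal carries very little energy.
left = [1.0, 0.5, -0.25, 0.75]
right = [0.9, 0.6, -0.25, 0.70]
mid, side = ms_encode(left, right)
similar_ratio = energy(side) / energy(mid)

# Large level difference: the side signal is almost as strong as the mid signal.
quiet_right = [0.05 * v for v in left]
mid2, side2 = ms_encode(left, quiet_right)
unbalanced_ratio = energy(side2) / energy(mid2)
```

Comparing `similar_ratio` with `unbalanced_ratio` shows why a fixed sum/difference rule loses its advantage once the two channels differ strongly in level.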
One can overcome the above problem by replacing the static concept of building the sum and the difference of the two stereo channels to be encoded with a decoder rotator matrix whose matrix elements describe the composition of two intermediate channels that are combinations of the two stereo channels. The matrix elements depend on parametric stereo parameters that are extracted from the left and the right channel of the stereo signal. Adaptive residual coding is thus able to dynamically adapt the combination rule for the generation of intermediate channels to the properties of the present signal, achieving a significant performance gain over MS coding.
By choosing a suitable dependency of the matrix elements of the so-called rotator matrix on the parametric stereo parameters, one can keep the energy within the difference channel as small as possible, as already shown in the unpublished European patent application EP 04103168.3. As one introduces a rotator matrix to transform (down-mix or up-mix) the stereo signal to signals m and s (the intermediate signals, i.e. the downmix signal m and the residual signal s), it is crucial for the operation of the method that the rotator matrices (the decoder rotator matrix and the encoder rotator matrix) are bounded. This means that the matrix elements do not diverge to infinity anywhere within the entire range of possible parametric stereo coding parameters. In other words, both rotator matrices have to be bounded in the sense that the matrix condition number is sufficiently small to allow problem-free matrix inversion for the entire range of parametric stereo coding parameters, which is not the case for implementations according to prior art techniques.
Summary of the invention
It is the object of the present invention to provide a concept for high-quality audio coding that yields a highly compressed representation of an audio signal while more efficiently avoiding artefacts introduced by the coding or decoding.
According to a first aspect of the present invention, this object is achieved by an audio encoder for encoding an audio signal having at least two channels, comprising: a parameter extractor for deriving a spatial parameter from the audio signal, wherein the spatial parameter describes an interrelation between the at least two channels; a limiter for limiting the spatial parameter using a limiting rule to derive a limited spatial parameter, wherein the limiting rule depends on an interrelation between the at least two channels; and a down-mixer for deriving a downmix signal and a residual signal from the audio signal using a down-mixing rule depending on the limited spatial parameter.
According to a second aspect of the present invention, this object is achieved by an audio decoder for decoding an encoded audio signal representing an original audio signal having at least two channels, the encoded audio signal having a downmix signal, a residual signal and a spatial parameter describing an interrelation between the at least two channels, comprising: a limiter for limiting the spatial parameter to derive a limited spatial parameter using a limiting rule, wherein the limiting rule depends on an interrelation between the at least two channels; and an up-mixer for deriving a reconstruction of the original audio signal from the downmix signal and the residual signal using an up-mixing rule depending on the limited spatial parameter.
According to a third aspect of the present invention, this object is achieved by a method for encoding an audio signal having at least two channels, the method comprising: deriving a spatial parameter from the audio signal, wherein the spatial parameter describes an interrelation between the at least two channels; limiting the spatial parameter using a limiting rule to derive a limited spatial parameter, wherein the limiting rule depends on an interrelation between the at least two channels; and deriving a downmix signal and a residual signal from the audio signal using a down-mixing rule depending on the limited spatial parameter.
According to a fourth aspect of the present invention, this object is achieved by a method for decoding an encoded audio signal representing an original audio signal having at least two channels, the encoded audio signal having a downmix signal, a residual signal and a spatial parameter describing an interrelation between the at least two channels, the method comprising: limiting the spatial parameter to derive a limited spatial parameter using a limiting rule, wherein the limiting rule depends on an interrelation between the at least two channels; and deriving a reconstruction of the original audio signal from the downmix signal and the residual signal using an up-mixing rule depending on the limited spatial parameter.
According to a fifth aspect of the present invention, this object is achieved by a transmitter or audio recorder having an audio encoder for encoding an audio signal having at least two channels, comprising: a parameter extractor for deriving a spatial parameter from the audio signal, wherein the spatial parameter describes an interrelation between the at least two channels; a limiter for limiting the spatial parameter using a limiting rule to derive a limited spatial parameter, wherein the limiting rule depends on an interrelation between the at least two channels; and a down-mixer for deriving a downmix signal and a residual signal from the audio signal using a down-mixing rule depending on the limited spatial parameter.
According to a sixth aspect of the present invention, this object is achieved by a receiver or audio player having an audio decoder for decoding an encoded audio signal representing an original audio signal having at least two channels, the encoded audio signal having a downmix signal, a residual signal and a spatial parameter describing an interrelation between the at least two channels, comprising: a limiter for limiting the spatial parameter to derive a limited spatial parameter using a limiting rule, wherein the limiting rule depends on an interrelation between the at least two channels; and an up-mixer for deriving a reconstruction of the original audio signal from the downmix signal and the residual signal using an up-mixing rule depending on the limited spatial parameter.
According to a seventh aspect of the present invention, this object is achieved by a method of transmitting or audio recording, the method having a method of generating an encoded signal, the method comprising a method for encoding an audio signal having at least two channels, the method comprising: deriving a spatial parameter from the audio signal, wherein the spatial parameter describes an interrelation between the at least two channels; limiting the spatial parameter using a limiting rule to derive a limited spatial parameter, wherein the limiting rule depends on an interrelation between the at least two channels; and deriving a downmix signal and a residual signal from the audio signal using a down-mixing rule depending on the limited spatial parameter.

According to an eighth aspect of the present invention, this object is achieved by a method of receiving or audio playing, the method having a method for decoding an encoded audio signal, the method comprising a method for decoding an encoded audio signal representing an original audio signal having at least two channels, the encoded audio signal having a downmix signal, a residual signal and a spatial parameter describing an interrelation between the at least two channels, the method comprising: limiting the spatial parameter to derive a limited spatial parameter using a limiting rule, wherein the limiting rule depends on an interrelation between the at least two channels; and deriving a reconstruction of the original audio signal from the downmix signal and the residual signal using an up-mixing rule depending on the limited spatial parameter.
According to a ninth aspect of the present invention, this object is achieved by a transmission system having a transmitter and a receiver, the transmitter having an audio encoder for encoding an audio signal having at least two channels, comprising: a parameter extractor for deriving a spatial parameter from the audio signal, wherein the spatial parameter describes an interrelation between the at least two channels; a limiter for limiting the spatial parameter using a limiting rule to derive a limited spatial parameter, wherein the limiting rule depends on an interrelation between the at least two channels; and a down-mixer for deriving a downmix signal and a residual signal from the audio signal using a down-mixing rule depending on the limited spatial parameter; and the receiver having an audio decoder for decoding an encoded audio signal representing an original audio signal having at least two channels, the encoded audio signal having a downmix signal, a residual signal and a spatial parameter describing an interrelation between the at least two channels, comprising: a limiter for limiting the spatial parameter to derive a limited spatial parameter using a limiting rule, wherein the limiting rule depends on an interrelation between the at least two channels; and an up-mixer for deriving a reconstruction of the original audio signal from the downmix signal and the residual signal using an up-mixing rule depending on the limited spatial parameter.
According to a tenth aspect of the present invention, this object is achieved by a method of transmitting and receiving, the method including a transmitting method having a method of generating an encoded signal of an audio signal having at least two channels, the method comprising: deriving a spatial parameter from the audio signal, wherein the spatial parameter describes an interrelation between the at least two channels; limiting the spatial parameter using a limiting rule to derive a limited spatial parameter, wherein the limiting rule depends on an interrelation between the at least two channels; and deriving a downmix signal and a residual signal from the audio signal using a down-mixing rule depending on the limited spatial parameter; and a receiving method, having a method for decoding an encoded audio signal, the method comprising: limiting the spatial parameter to derive a limited spatial parameter using a limiting rule, wherein the limiting rule depends on an interrelation between the at least two channels; and deriving a reconstruction of the original audio signal from the downmix signal and the residual signal using an up-mixing rule depending on the limited spatial parameter.
According to an eleventh aspect of the present invention, this object is achieved by an encoded audio signal being a representation of an audio signal having at least two channels, the encoded audio signal having a spatial parameter describing an interrelation between the at least two channels, a downmix signal and a residual signal, wherein the downmix signal and the residual signal are derived from the audio signal using a down-mixing rule depending on a limited spatial parameter derived using a limiting rule depending on an interrelation of the at least two channels.

The present invention is based on the finding that an audio signal having at least two channels can be efficiently down-mixed into a downmix signal and a residual signal when the down-mixing rule used depends on a spatial parameter that is derived from the audio signal and that is post-processed by a limiter, which applies a certain limit to the derived spatial parameter with the aim of avoiding instabilities during the up-mixing or down-mixing process. By having a down-mixing rule that dynamically depends on parameters describing an interrelation between the audio channels, one can ensure that the energy within the down-mixed residual signal is as small as possible, which is advantageous in view of coding efficiency. By post-processing the spatial parameter with a limiter prior to using it in the down-mixing, one can avoid instabilities in the down- or up-mixing, which otherwise could result in a disturbance of the spatial perception of the encoded or decoded audio signal.
In one embodiment of the present invention, an original stereo signal having a left and a right channel is supplied to a down-mixer and a parameter extractor. The parameter extractor derives the commonly known spatial parameters ICC (Inter-Channel-Correlation) and IID (Inter-Channel-Intensity-Difference). The down-mixer is able to down-mix the left and right channels into a downmix signal and a residual signal, wherein the down-mixing rule is such that the resulting residual signal carries the minimum achievable energy. Therefore, subsequent compression of the resulting residual signal by a standard audio encoder will result in an extremely compact code. This can be achieved by formulating the down-mixing rule in dependence on the spatial parameters ICC and IID, since both parameters describe intensity or amplitude ratios of the original stereo channels. A general problem during encoding is energy preservation. It is necessary that both the original signal and the encoded signal contain the same energy, since a violation of energy conservation would result in a different loudness perception of the encoded signals or even in uncontrollable jumps in the loudness of the encoded signal. Therefore, in the above encoding scheme the downmix signal and the residual signal have to be scaled by a scaling factor that ensures energy conservation.
If the original audio signal that is to be encoded has special properties, this scaling factor can diverge, in particular when the left and the right original channel are perfectly anti-correlated, i.e. when they have the same amplitudes and a phase shift of precisely 180°. This instability is avoided within the inventive concept by applying a limiting function to the ICC parameter, wherein the limiting function depends on a maximum acceptable scaling factor and the IID parameter. To avoid a possible divergence, the rule that describes the down-mixing is altered directly, whereas in state-of-the-art implementations the scaling factor is simply limited by setting a threshold, the scaling factor being replaced by the threshold value when it exceeds the threshold.
It is a big advantage of the inventive concept that both the signal within the downmix channel and the signal within the residual channel are altered by altering the parameters underlying the down-mixing process. Only the signal in the downmix channel would be influenced when applying a threshold according to prior art; thus, a better preservation of the interrelation between the original left and right channel can be achieved when following the inventive concept.
Another advantage of the concept described above is that the spatial parameters used are generally derived during an encoding process anyway. Therefore, one can implement the necessary limiting logic without having to introduce new parameters.

In a further embodiment of the present invention, a limiter is applied at the decoder side, having the same limiting rule as a limiter on the encoder side. This means that on the decoder side, the downmix and the residual signal as well as the spatial parameters IID and ICC are received, and the received spatial parameters are limited using the same limiting rule used during the encoding process. The up-mixing then depends on the limited spatial parameters, ensuring that no divergence occurs in the up-mixing process. The advantage of having the same limiting rules in the encoding and the decoding is obvious, since one only has to develop hardware circuits or an implementation of a software algorithm once. Hardware or software having both encoding and decoding functionality can be developed at lower cost, since the same hardware or software can be reused for the limiting functionality.
In a further embodiment of the present invention, the down-mixed signals and the spatial parameters are compressed after their generation, yielding two audio bit streams for the down-mixed signals and a parameter bit stream holding the compressed spatial parameters. This reduces the size of the encoded representation to be transmitted, further saving bandwidth, wherein the encoding may be lossy or lossless, since the encoding rule itself is independent of the inventive concept. An inventive decoder according to the inventive concept then comprises a decompression stage, where the compressed representations are decompressed into the spatial parameters, the down-mixed channel and the residual channel prior to up-mixing.
In another embodiment of the present invention, the already compressed audio bit streams and the parameter bit stream are combined into a combined bit stream, e.g. by multiplexing, allowing for a convenient storage of a generated file on a storage medium. This also allows for streaming applications, for example, streaming the encoded content via the internet, since all the relevant information is comprised in one single file or bit stream, allowing for a more convenient handling than in a case where three separate bit streams would be transferred. The corresponding inventive decoder then has a decombination stage, which could for example be a demultiplexer to decombine the bit stream into three separate bit streams, namely the two audio bit streams and the parameter bit stream.
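The combination and decombination stages can be sketched as follows. The framing used here (a 4-byte big-endian length in front of each of the three streams) is purely a hypothetical choice for illustration; the text does not prescribe any particular multiplexing format, and the function names are likewise assumptions.

```python
import struct


def multiplex(downmix_bits: bytes, residual_bits: bytes, param_bits: bytes) -> bytes:
    """Combine three bit streams into one, each prefixed by its length (hypothetical framing)."""
    out = bytearray()
    for chunk in (downmix_bits, residual_bits, param_bits):
        out += struct.pack(">I", len(chunk))  # 4-byte big-endian length prefix
        out += chunk
    return bytes(out)


def demultiplex(combined: bytes):
    """Split a combined stream back into (downmix, residual, parameter) bit streams."""
    streams = []
    pos = 0
    for _ in range(3):
        (length,) = struct.unpack_from(">I", combined, pos)
        pos += 4
        streams.append(combined[pos:pos + length])
        pos += length
    return tuple(streams)


combined = multiplex(b"downmix-payload", b"residual-payload", b"iid-icc-payload")
downmix_bits, residual_bits, param_bits = demultiplex(combined)
```

A decoder that ignores the residual stream can still read the downmix and parameter streams from such a container, which is in line with the backward compatibility discussed in the text.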
It is to be noted here that the inventive concept provides perfect backward compatibility with prior art residual coding, where the spatial parameters are not limited, and even with prior art parametric stereo coding, where a decoder does not make use of the residual signal. This is of course a major advantage, since newly encoded audio data can be reproduced with the maximum possible quality by inventive decoders, whereas it may also be reproduced by already existing decoders according to prior art.
In a further embodiment of the present invention, three inventive encoders are combined to encode a multi-channel audio signal comprising six individual channels, wherein each of the three inventive encoders encodes a pair of channels, deriving spatial parameters, a downmix and a residual signal for each of the channel pairs. The inventive concept can thereby also be used to encode multi-channel audio signals, where the efficiency of the coding and the compactness of the resulting representation have an even higher priority, since the total amount of data to be encoded and transmitted is much higher than for a stereo signal. In principle, an arbitrary number of inventive audio encoders can be combined to simultaneously encode a multi-channel audio signal having basically any number of single audio channels. In a further embodiment of the multi-channel audio encoder, the individual downmix signals and residual signals as well as the individual parameter bit streams are combined by a 3 to 2 down-mixer to obtain a common left signal, a common right signal, a common residual signal and a combined parameter bit stream, further reducing the amount of required bandwidth. The corresponding decoders then straightforwardly comprise a 2 to 3 up-mixer stage.
In another embodiment of the present invention, a transmitter or audio recorder comprises an inventive encoder, allowing for compact, high-quality audio recording or transmitting, wherein the size of the transmitted or stored audio content can be significantly reduced. Thus, more audio content can be stored on a storage medium of a given capacity, or less bandwidth is used during transmission of the audio signal.
In another embodiment, a receiver or audio player has an inventive decoder, allowing for streaming applications in limited bandwidth environments such as mobile phones, or allowing for the construction of small portable playback devices using storage media of limited capacity.
A combination of an inventive transmitter and receiver yields a transmission system that allows audio content to be transmitted conveniently via wired or wireless transmission interfaces, such as wireless LAN, Bluetooth, wired LAN, power line technologies, radio transmission, or any other type of data transmission.

Preferred embodiments of the present invention are subsequently described by referring to the enclosed drawings, wherein:
Fig. 1 shows a block diagram of an inventive encoder;
Fig. 2 shows a block diagram of the inventive encoding principle;
Fig. 3 shows another embodiment of an inventive encoder;
Fig. 4 shows the backwards compatibility of the inventive encoding scheme to prior art decoders;
Fig. 5 shows an inventive multi-channel audio encoder;
Fig. 6 shows a block diagram of an inventive audio decoder;
Fig. 7 shows a block diagram of the inventive decoding concept;
Fig. 8 shows a further embodiment of an inventive decoder;
Fig. 9 shows an embodiment of an inventive multi-channel audio decoder;
Fig. 10 shows an alternative embodiment of an inventive audio encoder;
Fig. 11 shows an alternative embodiment of an inventive audio decoder;
Fig. 12 shows an inventive transmitter/audio-recorder;
Fig. 13 shows an inventive receiver/audio-player;
Fig. 14 shows an inventive transmission system.
Detailed description of preferred embodiments
Fig. 1 shows a block diagram of an inventive audio encoder 10, comprising a down-mixer 12, a limiter 14, and a parameter extractor 16.
A stereo signal 18, having a left and a right channel, is input into the down-mixer 12 and into the parameter extractor 16 simultaneously. The parameter extractor 16 extracts spatial parameters 19 describing an interrelation between the left and the right channel of the stereo signal 18. These parameters are on the one hand made available for transmission and on the other hand input into the limiter 14. The limiter 14 applies a limiting rule to the parameters. The details of an appropriate limiting rule shall be derived in the following paragraphs.
The limiter derives limited spatial parameters, and these are input into the down-mixer 12, wherein the down-mixer 12 applies a down-mixing rule to the left and right channel of the stereo signal 18 to derive a downmix signal 20 and a residual signal 22 from the left and the right channel of the stereo signal. The down-mixing rule additionally depends on the limited spatial parameter.
When an appropriate limiting rule is chosen for the limiter, the down-mixer 12 is only supplied with parameters that are limited in such a way that the down-mixing rule neither diverges nor produces any output in which the spatial interrelation of the left and the right channel is deteriorated by the down-mixing.
As a result, the stereo signal 18 is represented by the downmix signal 20, the residual signal 22, and the spatial parameters 19 after the encoding process performed by the audio encoder 10.
To understand how a down-mixing rule and a limiting rule have to interrelate to provide a resulting residual signal 22 containing minimal feasible energy while simultaneously limiting a spatial parameter such that the down-mixing rule does not cause any divergences, the basic concept underlying the present invention is elaborated in more detail in the following few paragraphs.
The parameters extracted by the parameter extractor 16 typically result from a single time and frequency interval of sub-band samples from a complex modulated filter bank analysis of discrete time signals. That means that the audio signal of the left and right channel of the stereo signal 18 is first divided into time frames of a given length, and within a single time frame, the frequency spectrum is sub-divided into a number of sub-band samples. For each single sub-band, the parameter extractor 16 then derives a spatial parameter by comparing the left and right channels of the stereo signal within the sub-band of interest. Therefore, the left and the right channel of the stereo signal 18 and the downmix signal m and the residual signal s from Fig. 1 have to be understood as discrete and finite length vectors, describing the underlying signals within a discrete time interval. As mentioned above, energy preservation must be assured during down-mixing. For discrete complex vectors x, y, the complex inner product and squared norm (comparable to energy) are defined by

$$\langle x, y \rangle = \sum_n x(n)\, y^*(n), \qquad X = \|x\|^2 = \langle x, x \rangle = \sum_n |x(n)|^2 \tag{1}$$

Following the normal convention, * denotes complex conjugation. From here on, upper case letters describe the squared sum, or energy, of the corresponding finite length complex vectors denoted by lower case letters.
According to the present invention, the downmix channel m resulting from the adaptive downmix is the energy weighted sum of the original left and right channel, and thus defined by

$$m = g\,(l + r) \tag{2}$$

where g is a real and positive gain factor adjusted such that the energy of the downmix (M) equals the sum of the energies of the left (L) and right (R) channel signal vectors (M = L + R).
As this gain factor diverges to infinity when l and r are out of phase and have comparable energy (i.e. l + r = 0 in equation (2)), it is necessary to limit this factor by a maximal gain factor g0 that is typically within the interval [1, 2]. The parameter extractor 16, as shown in Fig. 1, extracts the spatial audio parameters IID (Interchannel Intensity Difference) and ICC (Interchannel Coherence) that are represented here by

$$c = \sqrt{L / R}, \qquad \rho = \frac{\operatorname{Re}\langle l, r \rangle}{\sqrt{L R}} \tag{3}$$

Here, c denotes the IID parameter and ρ denotes the ICC parameter. The gain factor g can be expressed in terms of the ICC and IID parameters, so that the required limitation of the gain factor can be written as follows:

$$g = \min\left\{ g_0,\ \sqrt{\frac{c^2 + 1}{c^2 + 1 + 2 \rho c}} \right\} \tag{4}$$
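As a minimal sketch of the parameter extraction and of the limited gain factor, the following assumes that the IID parameter is the amplitude ratio c = sqrt(L/R), that the ICC parameter is the real part of the normalised inner product, and that the unlimited gain follows from the energy condition M = L + R. All function names and the default g0 = 1.5 are assumptions chosen for illustration.

```python
import math


def inner(x, y):
    """Complex inner product <x, y> = sum of x(n) * conj(y(n))."""
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(x, y))


def energy(x):
    """Squared norm of a finite-length complex vector."""
    return inner(x, x).real


def spatial_parameters(l, r):
    """Return (c, rho): assumed IID as amplitude ratio and ICC as real-valued correlation."""
    L, R = energy(l), energy(r)  # both channels are assumed to be non-silent
    c = math.sqrt(L / R)
    rho = inner(l, r).real / math.sqrt(L * R)
    return c, rho


def downmix_gain(c, rho, g0=1.5):
    """Energy-preserving downmix gain, limited by the maximal gain factor g0."""
    denom = c * c + 1 + 2 * rho * c
    if denom <= 0:  # l + r = 0 (up to rounding): the unlimited gain would diverge
        return g0
    return min(g0, math.sqrt((c * c + 1) / denom))


l = [1 + 0j, 0.5j, -0.5]
r = [0.5, 0.5j, 0.25]
c, rho = spatial_parameters(l, r)
g = downmix_gain(c, rho)
m = [g * (x + y) for x, y in zip(l, r)]  # downmix m = g (l + r)
```

For this example the gain stays below g0, so the energy of m equals L + R; for perfectly anti-phase channels (c = 1, ρ = -1) the function returns g0 instead of diverging.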
To achieve maximum coding efficiency, it is desired that the energy within the residual signal 22 is minimal. The following derivation solves a more general optimization problem comprising an additional residual signal t, which then turns out to be superfluous due to (9). Considering the problem from the decoder side, one needs to determine gains a, b such that the residual signals s, t in the up-mix

$$l = a\,m + s, \qquad r = b\,m + t \tag{5}$$

have minimal energy. The solution is given by

$$a = \frac{1 + p}{2g}, \qquad b = \frac{1 - p}{2g} \tag{6}$$

where

$$p = \frac{\langle l - r,\ l + r \rangle}{\|l + r\|^2} \tag{7}$$

The same problem, with the additional restriction that the coefficients a, b are real, has the solution given by taking the real part of (7) and inserting it in (6). In this case, p can be expressed in terms of the PS parameters c, ρ as follows:

$$p = \frac{c^2 - 1}{c^2 + 1 + 2 \rho c} \tag{8}$$

By inserting (6) into (5) and adding the two equations in (5), and using a + b = 1/g together with (2), it follows that:

$$s + t = 0 \tag{9}$$
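The optimisation above can be checked numerically. The sketch below assumes that the real-valued solution has the form a = (1 + p)/(2g), b = (1 - p)/(2g), with p the real part of <l - r, l + r> / ||l + r||^2; this is the least-squares projection of l and r onto m, written in a form consistent with the surrounding text. The names and sample values are illustrative assumptions.

```python
import math


def inner(x, y):
    """Complex inner product <x, y> = sum of x(n) * conj(y(n))."""
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(x, y))


def upmix_gains(l, r, g):
    """Real gains a, b minimising the energy of s = l - a*m and t = r - b*m, with m = g (l + r)."""
    total = [x + y for x, y in zip(l, r)]
    diff = [x - y for x, y in zip(l, r)]
    p = inner(diff, total).real / inner(total, total).real
    return (1 + p) / (2 * g), (1 - p) / (2 * g), p


l = [1 + 0j, 0.5j, -0.5]
r = [0.5, 0.5j, 0.25]
L, R = inner(l, l).real, inner(r, r).real
total = [x + y for x, y in zip(l, r)]
g = math.sqrt((L + R) / inner(total, total).real)  # energy-preserving gain, not limited here
m = [g * v for v in total]

a, b, p = upmix_gains(l, r, g)
s = [x - a * v for x, v in zip(l, m)]  # residual of the left channel
t = [y - b * v for y, v in zip(r, m)]  # residual of the right channel
residual_sum = max(abs(u + v) for u, v in zip(s, t))
```

Since `residual_sum` vanishes up to rounding, only one of the two residual signals needs to be transmitted.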
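Describing the up-mixing process in the usual matrix notation, the up-mixing can be represented by a rotator matrix H as follows:

$$\begin{bmatrix} l \\ r \end{bmatrix} = H \begin{bmatrix} m \\ s \end{bmatrix}, \qquad H = \begin{bmatrix} a & 1 \\ b & -1 \end{bmatrix} \tag{10}$$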
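In the case where g is not limited by g0 in (4), a different representation of the optimal coefficients a, b is given by:

$$a = \frac{c\,(c + \rho)}{\sqrt{(c^2 + 1)(c^2 + 1 + 2 \rho c)}}, \qquad b = \frac{1 + \rho c}{\sqrt{(c^2 + 1)(c^2 + 1 + 2 \rho c)}} \tag{11}$$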
The first column of the rotator matrix H is identical to the amplitude rotator used in parametric stereo, which is derived, for example, in WO 03/090206 A1.
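The downmix needs to be compatible with the up-mix in the sense that perfect reconstruction is obtained when all lossy coding steps are omitted. As a consequence, the down-mixing matrix D,

$$\begin{bmatrix} m \\ s \end{bmatrix} = D \begin{bmatrix} l \\ r \end{bmatrix} \tag{12}$$

must be the inverse of the up-mix rotator H. An elementary computation yields

$$D = H^{-1} = \frac{1}{a + b} \begin{bmatrix} 1 & 1 \\ b & -a \end{bmatrix} \tag{13}$$

where the first row is consistent with (2), since a + b = 1/g.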
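The perfect-reconstruction requirement can be illustrated with a small sketch. It assumes an up-mix matrix of the form H = [[a, 1], [b, -1]], i.e. l = a*m + s and r = b*m - s, and computes its inverse explicitly; the coefficient values and helper names are arbitrary illustrative choices.

```python
def upmix_matrix(a, b):
    """Assumed up-mix rotator H: l = a*m + s, r = b*m - s."""
    return [[a, 1.0], [b, -1.0]]


def downmix_matrix(a, b):
    """Inverse of H: D = 1/(a + b) * [[1, 1], [b, -a]]."""
    k = 1.0 / (a + b)
    return [[k, k], [k * b, -k * a]]


def matmul2(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def apply2(A, x, y):
    """Apply the 2x2 matrix A to the channel pair (x, y), sample by sample."""
    u = [A[0][0] * p + A[0][1] * q for p, q in zip(x, y)]
    v = [A[1][0] * p + A[1][1] * q for p, q in zip(x, y)]
    return u, v


a, b = 0.9, 0.35
H = upmix_matrix(a, b)
D = downmix_matrix(a, b)
identity = matmul2(D, H)

left = [1.0, -0.5, 0.25]
right = [0.4, 0.3, -0.2]
m, s = apply2(D, left, right)        # encoder side: downmix and residual
rec_left, rec_right = apply2(H, m, s)  # decoder side: reconstruction
```

With a + b = 1/g, the first row of `D` reduces to m = g (l + r), so the downmix channel stays a scaled sum of the two input channels.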
There is a stability problem with the two optimal rotators given by (10) and (13). As (c,p) approaches (1,-1), the
value of p given by (8) diverges. Therefore one has to de¬viate from the optimal rotators in a neighborhood of this point of the PS parameter domain. The solution taught by the present invention is to modify the PS parameters by an instability limiter both in the encoder and in the decoder.

In its general form, such a limiter will alter the values of the pair (c,p) in a neighborhood of (1,-1) in order to achieve a bounded range for p. A particularly attractive solution is based on the observation that the denominator of (8) is the same as that of (4). The inventive solution keeps c unaltered and modifies p exactly when the adaptive downmix gain g is limited by g0 in (4). This occurs when

The preferred modification of p performed by the instability limiter 14 is then:

The corresponding value of p given by inserting the modified p in place of p in (8) has the property that

In the previous paragraphs, the problem analysis leading to the definition of the limiter 14 has been detailed. Although the notation is based on stereo signals, it is clear that the same method can be applied to any pair of audio signals, such as channel pairs selected from, or generated by a partial downmix of, a multi-channel audio signal. It is particularly advantageous that the same limiting rule can be used to limit the parameters within the up-mixing and the down-mixing matrix.
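The behaviour of such a limiter can be illustrated numerically. The defining equations are not reproduced in this text, so the sketch below rests on an explicit assumption: the downmix is taken to be m = g(l + r) with the energy-preserving gain g = sqrt((c^2 + 1)/(c^2 + 2pc + 1)), capped at g0, where c is treated as an amplitude ratio. Under that assumption the gain reaches g0 exactly when p falls to (1/g0^2 - 1)(c + 1/c)/2, so the limiter raises p to this bound and leaves c unchanged. The function names are hypothetical.

```python
import math


def downmix_gain(c, p):
    """Assumed energy-preserving gain for m = g * (l + r); an assumption, not from the text."""
    denom = c * c + 2.0 * p * c + 1.0
    if denom <= 0.0:
        return math.inf          # the point (c, p) = (1, -1) makes the denominator vanish
    return math.sqrt((c * c + 1.0) / denom)


def icc_limit(c, g0):
    """Smallest ICC for which the assumed gain does not exceed g0."""
    return 0.5 * (1.0 / (g0 * g0) - 1.0) * (c + 1.0 / c)


def limit_icc(c, p, g0):
    """Keep c unaltered and raise p to the bound only when the gain would exceed g0."""
    return max(p, icc_limit(c, g0))


g0 = 2.0                          # scaling factor, here chosen from the interval [1, 2]
c, p = 1.0, -1.0                  # the critical point of the parameter domain
p_lim = limit_icc(c, p, g0)       # limited coherence parameter
g_lim = downmix_gain(c, p_lim)    # gain obtained with the limited parameter
```

With these illustrative inputs the unlimited gain is unbounded, while the limited parameter yields a gain equal to g0. A pair (c, p) that already satisfies the bound passes through unchanged.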
Fig. 2 describes the inventive audio encoding procedure using a block diagram, showing how the audio encoding is performed when following the inventive concept. In a first parameter extraction step 30, the ICC and IID parameters are derived.

These parameters are then forwarded as output 23 and transferred to serve as input for the limiting step 32, where the ICC parameter is compared with a computed minimum ICC parameter ICCmin, wherein ICCmin depends on IID. In a first case, where the ICC parameter exceeds the minimum ICC parameter ICCmin(IID), the ICC parameter is directly forwarded to the down-mixing step 34.
If the ICC parameter does not exceed ICCmin(IID), an additional exchange step 36 is performed, where the value of the ICC parameter is replaced by the value of the minimum ICC parameter ICCmin(IID). After the exchange step 36, the ICC parameter having the new value is then transferred to the down-mixing step 34.
In the down-mixing step 34 the downmix signal 20 and the residual signal 22 are derived from the channels l and r, depending on the parameters ICC and IID.
Finally the parameters 23 (ICC and IID), the downmix signal 20 and the residual signal 22 are available as output of the encoding procedure.
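The parameter extraction step 30 could be sketched as follows. This is an illustration under stated assumptions, not the estimator of the invention: the samples are assumed to be real-valued and to belong to a single frequency band and time frame, IID is expressed as the amplitude ratio c of the two channels, and ICC as their normalized cross-correlation p. The function name extract_parameters is hypothetical.

```python
import math


def extract_parameters(left, right):
    """Estimate (IID, ICC) for one band and frame of real-valued samples (assumed form)."""
    energy_l = sum(x * x for x in left)
    energy_r = sum(y * y for y in right)
    if energy_l == 0.0 or energy_r == 0.0:
        raise ValueError("a silent channel leaves IID and ICC undefined in this sketch")
    cross = sum(x * y for x, y in zip(left, right))
    iid = math.sqrt(energy_l / energy_r)          # level difference as amplitude ratio c
    icc = cross / math.sqrt(energy_l * energy_r)  # coherence p, in the range [-1, 1]
    return iid, icc


# Two channels of equal level and opposite phase: the critical case (c, p) = (1, -1).
left = [1.0, -2.0, 0.5]
right = [-1.0, 2.0, -0.5]
iid, icc = extract_parameters(left, right)
```

Out-of-phase channels of equal level land on the point of the parameter domain where the limiting step 32 is needed.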
Fig. 3 shows another embodiment of an inventive audio encoding device 50 that comprises an audio encoder 10, a signal processing unit 51 having a first audio compressor 52, a second audio compressor 54, and a parameter compressor 56, and an output interface 58.
The components of the audio encoder 10 have already been discussed in the previous paragraphs. Therefore, only those parts of the audio encoding device 50 that are extending the audio encoder 10 will be discussed in the following paragraphs.
The general purpose of the signal processing unit 51 is to compress the downmix signal 20, the residual signal 22 and the parameters 23. Therefore, the downmix signal 20 is input into the first audio compressor 52, the residual signal 22 is input into the second audio compressor 54 and the spatial parameters 23 are input into the parameter compressor 56. The first audio compressor 52 derives a first audio bit stream 60, the second audio compressor 54 derives a second audio bit stream 62 and the parameter compressor 56 derives a parameter bit stream 64. The first and the second audio bit stream (60, 62) and the parameter bit stream 64 are then used as input of the output interface 58, which combines the three bit streams (60, 62, 64) to derive a combined bit stream 66, which is the output of the inventive encoding device 50.
The combination performed by the output interface 58 could, for example, be a simple multiplexing of the three incoming bit streams. Furthermore, any kind of combination that leads to a single output bit stream 66 is possible. A single bit stream is much more convenient to handle, for example when streaming via the internet or other data links.
In other words, Figure 3 illustrates an encoder that takes a two-channel audio signal, comprising the channels l, r, as input and generates a bitstream that permits decoding by a parametric stereo decoder. The adaptive downmix takes the two-channel signal l, r and generates a mono downmix m and a residual signal s. These signals can then be encoded by perceptual audio encoders to produce compact audio bitstreams. The parametric stereo (PS) parameter estimation takes the two-channel signal l, r as input and generates a set of PS parameters. The instability limiter modifies the PS parameters which control the adaptive downmix. The encoding block produces the parametric stereo side information (PS sideinfo) from the unmodified output of the PS parameter estimation. The multiplexer combines all encoded data to form the combined bitstream.

It is one of the major advantages of the inventive coding concept that it is fully backwards compatible with prior art parametric stereo decoders. To illustrate this, Fig. 4 shows a prior art parametric stereo decoder.
The parametric stereo decoder 70 comprises an input interface 72, an audio decoder 74, a parameter decoder 76, and an up-mixer 78.
The input interface 72 receives a combined bit stream 80 as produced by an inventive audio encoding device 50. The input interface 72 of the prior art parametric stereo decoder 70 does not recognize the residual signal 22 and therefore extracts only the first audio bit stream 60 (carrying the downmix signal, see Fig. 3) and the parameter bit stream 64 from the input bit stream 80. The audio decoder 74 is the complementary device to the first audio compressor 52 and the parameter decoder 76 is the complementary device to the parameter compressor 56. Therefore, the audio bit stream 60 is decoded into the downmix signal 20 and the parameter bit stream 64 is decoded into the spatial parameters 23. Since the spatial parameters 23 have been transferred directly and have not been further processed by the inventive encoder 10 or 50, a prior art up-mixer 78 can reconstruct a left and a right channel, building an output signal 80 from the downmix signal 20 using the spatial parameters 23.
In other words, Figure 4 illustrates a parametric stereo decoder that takes a compatible bitstream as generated by an inventive encoding device 50 as input and generates the stereo audio signal comprising the channels l and r, without using or without having access to the part of the bitstream that describes the residual signal. First a demultiplexer takes the compatible bitstream as input and decomposes it into one audio bitstream and the PS sideinfo. The perceptual audio decoder produces a mono signal m, and the PS sideinfo is decoded into PS parameters. The PS synthesis converts the mono signal into left and right signals l and r in accordance with the PS parameters, in particular by adding a decorrelated signal in order to retain the channel correlation of the original stereo channels.
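The multiplexing of Fig. 3 and the backward compatibility shown in Fig. 4 can be illustrated with a simple container. The layout below is purely hypothetical (a one-byte tag followed by a four-byte length and the payload) and is not the bitstream syntax of any real codec. It only shows how a decoder that does not know the residual tag can skip that part of the stream and still obtain the downmix and the parameters.

```python
import struct

# Hypothetical chunk tags, chosen for this sketch only.
TAG_DOWNMIX, TAG_RESIDUAL, TAG_PARAMS = 1, 2, 3
HEADER = ">BI"                       # tag (1 byte) + payload length (4 bytes, big endian)
HEADER_SIZE = struct.calcsize(HEADER)


def multiplex(downmix, residual, params):
    """Combine the three bit streams into a single tagged byte stream."""
    out = bytearray()
    for tag, payload in ((TAG_DOWNMIX, downmix),
                         (TAG_PARAMS, params),
                         (TAG_RESIDUAL, residual)):
        out += struct.pack(HEADER, tag, len(payload)) + payload
    return bytes(out)


def demultiplex(stream, known_tags):
    """Return the chunks whose tags the decoder knows; skip all others."""
    chunks = {}
    pos = 0
    while pos < len(stream):
        tag, length = struct.unpack_from(HEADER, stream, pos)
        pos += HEADER_SIZE
        if tag in known_tags:
            chunks[tag] = stream[pos:pos + length]
        pos += length                # unknown chunks are stepped over, not parsed
    return chunks


combined = multiplex(b"mono-bits", b"residual-bits", b"ps-sideinfo")
legacy = demultiplex(combined, {TAG_DOWNMIX, TAG_PARAMS})               # prior art decoder
full = demultiplex(combined, {TAG_DOWNMIX, TAG_PARAMS, TAG_RESIDUAL})   # residual-aware decoder
```

The legacy view contains exactly the two streams a parametric stereo decoder needs, while the full view additionally exposes the residual stream.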
Fig. 5 shows an inventive multi-channel-audio encoder 100 that encodes a 6-channel audio signal into a stereo downmix and a number of parameter sets.
The multi-channel audio encoder 100 comprises a first adaptive encoder 102, a second adaptive encoder 104, a summation module 106, a parameter extractor 108, and a 3 to 2 down-mixer 110.
The first adaptive encoder 102 and the second adaptive encoder 104 are embodiments of an inventive encoder 10. The 6-channel input signal has a left front channel 112a, a left rear channel 112b, a right front channel 114a, a right rear channel 114b, a center channel 116a, and a low frequency enhancement channel 116b. The left front channel 112a and the left rear channel 112b are input into the first adaptive encoder 102, which derives a first downmix signal 118a, the corresponding residual signal 118b and spatial parameters 118c. The right front channel 114a and the right rear channel 114b are input into the second adaptive encoder 104, which derives a second downmix signal 120a, the corresponding residual signal 120b, and the underlying spatial parameters 120c. The center channel 116a and the low frequency enhancement channel 116b are input into the summation module 106, which adds the signals to create a mono signal 122a and corresponding spatial parameters 122b.
The 3 to 2 down-mixer 110 receives the downmix signals 118a, 120a, and 122a to down-mix them into a stereo output signal 124 having a left and a right channel. The 3 to 2 down-mixer additionally derives a residual signal 126 from the input channels 118a, 120a, and 122a. Furthermore, the 3 to 2 down-mixer 110 derives a parameter set 128 from the parameter sets 118c, 120c, and 122b.
In summary, Fig. 5 illustrates a part of a spatial audio encoder that takes as input a multi-channel audio signal in 5.1 format, comprising the channels Lf (left front), Lr (left surround), Rf (right front), Rr (right surround), C (centre) and LFE (low-frequency enhancement), and that creates a stereo down-mix, comprising Lo and Ro, and a number of parameter sets. Not shown in this figure are time to frequency transforms, coding of the down-mix signals and parameters, and multiplexing of the coded information into a bitstream which can be decoded by a corresponding spatial audio decoder. The adaptive down-mix takes as input the signals Lf and Lr and produces a mono signal L and a residual signal L. The parametric stereo (PS) parameter estimation takes the two-channel signal Lf and Lr as input and generates a set of PS parameters. The instability limiter modifies the PS parameters that control the adaptive down-mix. In a similar manner, the adaptive down-mix takes as input the signals Rf and Rr and produces a mono signal R and a residual signal R. The parametric stereo (PS) parameter estimation takes the two-channel signal Rf and Rr as input and generates a set of PS parameters. The instability limiter modifies the PS parameters that control the adaptive down-mix. The summation module adds the signals C and LFE to create a mono signal C. The parametric stereo (PS) parameter estimation takes the two-channel signal C and LFE as input and generates a set of IID parameters, a subset of PS parameters. The mono signals L, R and C are mixed to a stereo signal (Lo and Ro) and a residual signal Eo by the 3 to 2 module. The 3 to 2 module also outputs a parameter set {Lo, Ro}.
Fig. 6 describes an inventive audio decoder 140, comprising an up-mixer 142, and a limiter 144.

The inventive decoder 140 receives a downmix signal 146, a residual signal 148 and spatial parameters 150. The downmix signal 146 and the residual signal 148 are input into the up-mixer 142, whereas the spatial parameters 150 are input into the limiter 144. The limiter 144 limits the spatial parameters 150 to derive limited spatial parameters 152.
It is important to note that the limiter uses the same limiting rule to derive the limited parameters as the corresponding encoder did during the encoding process. The limited parameters are used to control the up-mixing process in the up-mixer 142, which derives a stereo signal 154 having a left and a right channel from the downmix signal 146 and the residual signal 148.
Fig. 7 shows a block diagram illustrating the principle of an inventive decoder. In a first limiting step 160 the received spatial parameters ICC and IID are limited. That is, it is checked whether the received ICC parameter exceeds a minimum ICC parameter ICCmin(IID). If this is the case, the spatial parameters 150 (ICC and IID), a received downmix signal 146, and a received residual signal 148 are transmitted to the up-mixing step 162. If the ICC parameter does not exceed the minimum ICC parameter ICCmin(IID), a limiting step 164 is additionally performed, where the value of the ICC parameter is replaced by the value of the parameter ICCmin(IID), with the effect that the value of ICCmin(IID) is transmitted to the up-mixing step 162.
In the up-mixing step 162, a stereo signal 154 having a left and a right channel is derived from the downmix signal 146 and the residual signal 148, using the spatial parameters ICC and IID.
Fig. 8 shows a further embodiment of an inventive decoding device 180 that comprises a decoder 140, a signal-processing unit 182 having a first audio decoder 184, a second audio decoder 186 and a parameter decoder 188. The decoding device 180 further comprises an input interface 190 for receiving a combined bit stream 192 that is generated by an inventive encoding device 50.
The combined bit stream 192 is decomposed by the input interface 190 into a first audio bit stream 194a, a second audio bit stream 194b and a parameter bit stream 196.
The first audio bit stream 194a is input into the first audio decoder 184, the second audio bit stream 194b is input into the second audio decoder 186, and the parameter bit stream 196 is input into the parameter decoder 188. The decompressed downmix signal 198 (m) and the residual signal 200 (s) are input into the up-mixer 142 of the decoder 140. Spatial parameters 202 derived by the parameter decoder 188 are input into the limiter 144 of the audio decoder 140. The limiting of the spatial parameters and the up-mixing have already been described within the description of the audio decoder 140. A detailed description can be obtained from the corresponding paragraphs of the description of Fig. 6.
The inventive decoding device 180 finally outputs a stereo signal 204, having a left and a right channel.
In other words, Fig. 8 illustrates a parametric stereo decoder that takes a compatible bitstream as input and generates the stereo audio signal comprising the channels l and r. First a demultiplexer takes the compatible bitstream as input and decomposes it into two audio bitstreams and the PS sideinfo. Perceptual audio decoders produce a mono signal m and a residual signal s respectively, and the PS sideinfo is decoded into PS parameters by the parameter decoder. The instability limiter modifies the PS parameters. The up-mixer converts the mono and residual signals into left and right signals l and r by means of a rotation matrix defined from the PS parameters modified by the instability limiter.

Fig. 9 shows an inventive multi-channel audio decoder 210 comprising a first two-channel decoder 212, a second two-channel decoder 214, a synthesis module 216, and a 2 to 3 module 218.
Figure 9 illustrates part of a spatial audio decoder that takes as input a stereo audio signal (comprising Lo and Ro), a residual signal Eo and a parameter set {Lo, Ro}. The 2 to 3 module 218 produces three audio channels L, R, and C from the above-mentioned input. The mono channel L and the residual channel L are converted by a first two-channel decoder 212 into the Lf and Lr output signals. The instability limiter modifies the PS parameter set L. Similarly, the mono channel R and the residual channel R are converted by a second two-channel decoder 214 into the Rf and Rr output signals. The instability limiter is the same as used during the generation of the mono channel R and modifies the PS parameter set R. The PS synthesis module 216 takes the mono channel C and parameter set C and generates the C and LFE output channels.
Fig. 10 and 11 show an alternative solution for an encoder and a decoder avoiding the instability problem. The alternative is based on using the limited spatial parameters as the parameters to be encoded and transmitted. This can be seen in the inventive encoder in Fig. 10, which is based on the inventive encoding device of Fig. 3.
Fig. 10 shows a modification of the inventive encoder already shown in Fig. 3, with the difference that the parameters fed into the parameter compressor 56 are taken at a point 300, i.e. after the limiting process. That is, the limited parameters are encoded and transmitted instead of the original parameters.

On the decoder side shown in Fig. 11, the modification is that the limiter can be omitted compared to the decoding device 180. Therefore, the decoded spatial parameter 310 is input directly into the up-mixer 142 to derive the stereo signal 204.
The disadvantages of this solution compared to the placement of instability limiters as taught before and shown in the previous figures are twofold. First, the quantization of the limited parameters would move the rotators further away from optimality than necessary. The residual would therefore be larger in general, leading to a loss in coding gain for the residual coding method. Second, backwards compatibility with parametric stereo decoding would be lost. In critical cases, when the channel correlation of the original channels is negative, the decoder would not be able to reproduce this correlation without access to the residual signal.
Fig. 12 shows an inventive audio transmitter or recorder 330 that has an audio encoder 50, an input interface 332 and an output interface 334.
An audio signal can be supplied at the input interface 332 of the transmitter/recorder 330. The audio signal is encoded by an inventive encoder 50 within the transmitter/recorder and the encoded representation is output at the output interface 334 of the transmitter/recorder 330. The encoded representation may then be transmitted or stored on a storage medium.
Fig. 13 shows an inventive receiver or audio player 340, having an inventive audio decoder 180, a bit stream input 342, and an audio output 344.
A bit stream can be input at the input 342 of the inventive receiver/audio player 340. The bit stream is then decoded by the decoder 180 and the decoded signal is output or played at the output 344 of the inventive receiver/audio player 340.
Fig. 14 shows a transmission system comprising an inventive transmitter 330, and an inventive receiver 340.
The audio signal input at the input interface 332 of the transmitter 330 is encoded and transferred from the output 334 of the transmitter 330 to the input 342 of the receiver 340. The receiver decodes the audio signal and plays back or outputs the audio signal on its output 344.
The above-mentioned and described embodiments of the present invention are merely illustrative of the principles of the present invention for the improvement of adaptive residual coding. It is understood that modifications and variations of the arrangements and details described herein will be apparent to others skilled in the art. It is the intent, therefore, to be limited only by the scope of the appended patent claims and not by the specific details presented by way of description and explanation of the embodiments herein.
Although the embodiments of the present invention described in the figures above are described using mainly a nomenclature used for stereo signals, it is apparent that the present invention is not limited to stereo signals but could be applied to any other kind of combination of two audio signals, as for example done within the multi-channel audio encoders and decoders shown in Fig. 5 and Fig. 9.
Using an inventive transmission system having a transmitter and a receiver, the transmission between the transmitter and the receiver can be achieved by various means. This can be, for example, live streaming over the internet or other network media, storing a file on a computer-readable medium and transferring the medium, or directly connecting the transmitter and the receiver by cable or wirelessly, such as by wireless LAN or Bluetooth, or by any other imaginable data connection.
Although it has been described in detail that only the ICC parameter is to be changed to assure a non-diverging up- and downmix matrix, it is also possible to limit both the IID and ICC parameters such that no divergence will occur. More generally, applying the inventive concept can also mean deriving other spatial parameters and applying a limiting rule to these parameters, assuring a non-diverging down- and up-mix.
The output and input interfaces in the inventive encoders and decoders are not limited to simple multiplexers or de-multiplexers only. In a more sophisticated variation, the output interface may combine the bit streams not by just multiplexing them but by any other means, possibly even by applying some further entropy coding to reduce the size of the bit stream.
Depending on certain implementation requirements of the inventive methods, the inventive methods can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, in particular a disk, DVD or a CD having electronically readable control signals stored thereon, which cooperate with a programmable computer system such that the inventive methods are performed. Generally, the present invention is, therefore, a computer program product with a program code stored on a machine-readable carrier, the program code being operative for performing the inventive methods when the computer program product runs on a computer. In other words, the inventive methods are, therefore, a computer program having a program code for performing at least one of the inventive methods when the computer program runs on a computer.
While the foregoing has been particularly shown and described with reference to particular embodiments thereof, it will be understood by those skilled in the art that various other changes in the form and details may be made without departing from the spirit and scope thereof. It is to be understood that various changes may be made in adapting to different embodiments without departing from the broader concepts disclosed herein and comprehended by the claims that follow.

We Claim:
1. Audio encoder (10) for encoding an audio signal having at least two
channels, comprising:
a parameter extractor (16) for deriving a coherence parameter (ICC)
describing a coherence between a first channel and second channel of the
at least two channels and a level parameter (IID) describing a level
difference between the first channel and the second channel as spatial
parameters;
a limiter (14) for limiting the coherence parameter to derive a limited
coherence parameter, wherein the limit of the coherence parameter
depends on the level parameter and on a scaling factor;
a down-mixer (12) for deriving a downmix signal (20) and a residual
signal (18) from the audio signal using a down-mixing rule depending on
the limited coherence parameter.
2. The audio encoder (10) as claimed in claim 1, wherein the parameter
extractor (16) is operative to derive multiple spatial parameters for a
given time portion of the audio signal.

3. The audio encoder (10) as claimed in claim 1 or 2, wherein the limiter (14) is operative to limit the coherence parameter such that a ratio of intensities between the down-mix signal (20) and the at least two channels does not exceed a predefined limit.
4. The audio encoder (10) as claimed in claim 3, wherein the predefined gain factor is selected from the interval [1, 2].
5. The audio encoder (10) as claimed in any of claims 1 to 4, wherein the down-mixer (12) is operative to use a down-mixing rule such that the downmix signal (20) and the residual signal (18) are derived by forming a linear combination of the channels from the at least two channels, wherein the coefficients of the linear combination are depending on the limited coherence parameter.
6. The audio encoder (10) as claimed in any of claims 1 to 5, comprising a
signal processing unit (51) for processing or transmitting the downmix
signal (20), the residual signal (18), and the spatial parameters to derive a
processed downmix signal, a processed residual signal, and processed
parameters.

7. The audio encoder (10) as claimed in claim 6, wherein the signal processing unit (51) is operative to derive the processed downmix signal, the processed residual signal, and the processed parameters such that the deriving includes a compression of the downmix signal (20), the residual signal (18), and the spatial parameters.
8. The audio encoder (10) as claimed in claim 6 or 7, comprising an output interface (58) for providing the information of the processed downmix signal (20), the processed residual signal (18), and the processed spatial parameters.
9. The audio encoder (10) as claimed in claim 8, wherein the output interface (58) is operative to combine the processed downmix signal, the processed residual signal, and the processed parameters to derive an output bit stream having the information of the processed downmix signal, the processed residual signal and the processed parameters.
10. The audio encoder (10) as claimed in claim 9, wherein the output interface
(58) is operative to multiplex the processed downmix signal, the processed
residual signal, and the processed parameters to derive the output bit stream.
11. The audio encoder (10) as claimed in any of claims 1 to 10, wherein multiple
pairs of channels are encoded, and wherein for each pair of channels a spatial
parameter, a downmix signal (20) and a residual signal (18) is derived.

12. The audio encoder (10) as claimed in claim 11, wherein the multiple pairs of channels comprise a left front, a left rear, a right front, a right rear, a low frequency enhancement and a center channel.
13. An audio decoder (140) for decoding an encoded audio signal representing an original audio signal having at least two channels, the encoded audio signal having a downmix signal and a residual signal as well as a coherence parameter (ICC) describing a coherence between a first and a second channel of the at least two channels and a level parameter (IID) describing a level difference between the first and the second channel as spatial parameters, comprising:
a limiter (144) for limiting the coherence parameter to derive a limited coherence parameter, wherein the limit of the coherence parameter depends on the level parameter and on a scaling factor; and
an up-mixer (142) for deriving a reconstruction of the original audio signal (154) from the downmix signal and the residual signal using an up-mixing rule depending on the limited coherence parameter.
14. The audio decoder (140) as claimed in claim 13, wherein the limiter (144) is
operative to limit multiple coherence parameters for a given time portion of the
encoded audio signal corresponding to a time frame of the original audio signal.

15. The audio decoder (140) as claimed in claim 13 or 14, wherein the limiter (144) is operative to limit the coherence parameter such that a ratio of intensities between the downmix signal and the at least two channels of the original audio signal does not exceed a predefined limit.
16. The audio decoder (140) as claimed in claim 13, wherein the predefined gain factor is chosen from the interval [1, 2].
17. The audio decoder (140) as claimed in any of claims 13 to 16, wherein the up-mixer (142) is operative to use an up-mixing rule such that a first reconstructed channel and a second reconstructed channel of the at least two channels are derived by forming a linear combination of the downmix signal and the residual signal, wherein the coefficients of the linear combination are depending on the limited coherence parameter.
18. The audio decoder (140) as claimed in any of claims 13 to 17, comprising a signal processing unit (182) for transmitting or processing a processed residual signal, a processed downmix signal, and processed parameters to derive the residual signal, the downmix signal, and the spatial parameters.

19. The audio decoder (140) as claimed in claim 18, wherein the signal processing unit (182) is operative to derive the residual signal, the downmix signal, and the spatial parameter such that the deriving of the residual signal, the downmix signal and the spatial parameters includes decompression of the processed residual signal, the processed downmix signal, and the processed spatial parameters.
20. The audio decoder (140) as claimed in claim 18 or 19, comprising an input interface (190) for providing the processed residual signal, the processed downmix signal and the processed spatial parameters.
21. The audio decoder (140) as claimed in claim 20, wherein the input interface (190) is operative to decompose a single input bit stream to derive the processed residual signal, the processed downmix signal and the processed parameters.
22. The audio decoder (140) as claimed in claim 21, wherein the input interface (190) is operative to decompose the single input bit stream such that the deriving of the processed residual signal, the processed downmix signal and the processed parameters includes a de-multiplexing of the input bit stream.
23. Method for encoding an audio signal having at least two channels, the method comprising:

deriving a coherence parameter (ICC) describing a coherence between a first and a second channel of the at least two channels and a level parameter (IID) describing a level difference between the first and the second channel as spatial parameters;
limiting the coherence parameter to derive a limited coherence parameter, wherein the limit of the coherence parameter depends on the level parameter and on a scaling factor; and
deriving a downmix signal and a residual signal from the audio signal using a down-mixing rule depending on the limited coherence parameter.
24. A method for decoding an encoded audio signal representing an original audio signal having at least two channels, the encoded audio signal having a downmix signal and a residual signal as well as a coherence parameter (ICC) describing a coherence between a first and a second channel of the at least two channels and a level parameter (IID) describing a level difference between the first and second channel as spatial parameters, the method comprising:
limiting the coherence parameter to derive a limited coherence parameter, wherein a limit of the coherence parameter depends on the level parameter and on a scaling factor; and
deriving a reconstruction of the original audio signal from the downmix signal and the residual signal using an up-mixing rule depending on the limited coherence parameter.
