Title of Invention

VIDEO ENCODING AND DECODING METHODS AND CORRESPONDING DEVICES

Abstract The invention relates to the field of video compression and, more specifically, to a video encoding method applied to an input sequence of frames in which each frame is subdivided into blocks of arbitrary size. This method comprises, for at least a part of said blocks of the current frame, the steps of generating on a block basis motion-compensated frames, each one being obtained from each current original frame and a previous reconstructed frame, generating from said motion-compensated frames residual signals, using a so-called matching pursuit (MP) algorithm for decomposing each of said generated residual signals into coded dictionary functions called atoms, the other blocks of the current frame being processed by means of other coding techniques, and coding said atoms and the motion vectors determined during the motion compensation step, for generating an output coded bitstream. According to the invention, said method is such that, when using said MP algorithm, a specific dictionary is available at the encoding side for each block shape respectively. According to another implementation, it is also possible to use several specific dictionaries. In this second solution, if several dictionaries are available at the encoding side, a bitstream syntax is defined for placing, at a predetermined level, flags provided to indicate which dictionary should be used and placed for example at the atom level, at the block level, at the macroblock level or at the picture level.
Full Text

WO 2005/015501

PCT/IB2004/002478

"VIDEO ENCODING AND DECODING METHODS AND CORRESPONDING DEVICES"
FIELD OF THE INVENTION
The present invention generally relates to the field of video compression and, for instance, more particularly to the video standards of the MPEG family (MPEG-1, MPEG-2, MPEG-4) and to the video coding recommendations of the ITU H26X family (H.261, H.263 and extensions). More specifically, the invention relates to a video encoding method applied to an input sequence of frames in which each frame is subdivided into blocks of arbitrary size, said method comprising for at least a part of said blocks of the current frame the steps of:
- generating on a block basis motion-compensated frames, each one being obtained from each current original frame and a previous reconstructed frame ;
- generating from said motion-compensated frames residual signals ;
- using a so-called matching pursuit (MP) algorithm for decomposing each of said generated residual signals into coded dictionary functions called atoms, the other blocks of the current frame being processed by means of other coding techniques ;
- coding said atoms and the motion vectors determined during the motion compensation step, for generating an output coded bitstream.
BACKGROUND OF THE INVENTION
In the current video standards (up to the video coding MPEG-4 standard and H.264 recommendation), the video, described in terms of one luminance channel and two chrominance ones, can be compressed thanks to two coding modes applied to each channel : the "intra" mode, exploiting in a given channel the spatial redundancy of the pixels (picture elements) within each image, and the "inter" mode, exploiting the temporal redundancy between separate images (or frames). The inter mode, relying on a motion compensation operation, allows to describe an image from one (or more) previously decoded image(s) by encoding the motion of the pixels from one (or more) image(s) to another one. Usually, the current image to be coded is partitioned into independent blocks (for instance, of size 8 x 8 or 16 x 16 pixels in MPEG-4, or of size 4 x 4, 4 x 8, 8 x 4, 8 x 8, 8x16,16x8 and 16 x 16 in H.264), each of them being assigned a motion vector (the three channels share such a motion description). A prediction of said image can then be constructed by displacing pixel blocks from a reference image according to the set of

motion vectors associated to each block. Finally, the difference, or residual signal, between the current image to be encoded and its motion-compensated prediction can be encoded in the intra mode (with 8x8 discrete cosine transforms - or DCTs - for MPEG-4, or 4 x 4 DCTs for H.264 in the main level profile).
The DCT is probably the most widely used transform, because it offers a good compression efficiency in a wide variety of coding situations, especially at medium and high bitrates. However, at low bitrates, the hybrid motion compensated DCT structure may be not able to deliver an artefact-free sequence for two reasons. First, the structure of the motion-compensated inter prediction grid becomes visible, with blocking artifacts. Moreover, the block edges of the DCT basis functions become visible in the image grid, because too few coefficients are quantized - and too coarsely - to make up for these blocking artifacts and to reconstruct smooth objects in the image.
The document "Very low bit-rate video coding based on matching pursuits", R.Neff and A. Zakhor, IEEE Transactions on Circuits and Systems for Video Technology, vol.7, n°l, February 1997, pp.158-171, describes a new motion-compensated system including a video compression algorithm based on the so-called matching pursuit (MP) algorithm, a technique developed about ten years ago (see the document "Matching pursuits with time-frequency dictionaries", S.G.Mallat and Z.Zhang, IEEE Transactions on Signal Processing, vol.41, n°12, December 1993, pp.3397-3414). Said technique provides a way to iteratively decompose any function or signal (for example, image, video,...) into a linear expansion of waveforms belonging to a redundant dictionary of basis functions, well localized both in time and frequency and called atoms. A general family of time-frequency atoms can be created by scaling, translating and modulating a single function g(t) € L2(R) supposed to be real and continuously differentiable. These dictionary functions may be designated by:
gy(t) έ G (G = dictionary set), (1)
7 (= gamma) being an indexing parameter associated to each particular dictionary element (or atom). As described in the first cited document, assuming that the functions gy{t) have
unit norm, i.e. =l, the decomposition of a one-dimensional time signal f(t)
begins by choosing y to maximize the absolute value of the following inner product:
p = , (2)






To obtain this parameter set, a training set of motion residual images was decomposed using a dictionary derived from a much larger set of parameter triples. The dictionary elements which were most often matched to the training images were retained in the reduced set. The obtained dictionary was specifically designed so that atoms can freely match the structure of motion residual image when their influence is not confined to the boundaries of the block they lie in (see Fig.2 shoving the example of an atom placed in a block-divided image without block-restrictions).

However, it has been recently proposed, in a European patent application filed on August 5, 2003, by the applicant with the number EP03300081.1 (PHFR030085), a hybrid motion-compensated coding system using atoms that are confined to block boundaries, as depicted in Fig.3. More precisely, the invention described and claimed in said patent application mainly relates to a video encoding method applied to an input sequence of frames in which each frame is subdivided into blocks of arbitrary size, said method comprising for at least a part of said blocks of the current frame the steps of:
- generating on a block basis motion-compensated frames, each one being
obtained from each current original frame and a previous reconstructed frame;
- generating from said motion-compensated frames residual signals ;
-using a so-called matching pursuit (MP) algorithm for decomposing each of
said generated residual signals into coded dictionary functions called atoms, the other blocks of the current frame being processed by means of other coding techniques (the words "at least a part of said blocks" used above mean that some blocks or all the blocks are concerned by the implementation of the invention, the other ones being processed by these other techniques, which justifies the fact that the coding system is called "hybrid");
- coding said atoms and the motion vectors determined during the motion
compensation step, for generating an output coded bitstream ;
said method being such that, when using said MP algorithm, any atom acts only on one block B at a time, said block-restriction leading to the fact that the reconstruction of a
residual signal f is obtained from a dictionary that is composed of basis functions g
restricted to the block B corresponding to the indexing parameter yR, according to the following 2D spatial domain operation :

The main interest of this previous approach is to better model the blocky structure of residual signals, to augment the dictionary diversity for the same coding cost and to offer the possibility of alternating MP and DCT transforms since there is no interference across block boundaries (it also avoids the need to resort to overlapped motion compensation to limit blocking artefacts). The main elements useful to understand this previous implementation are recalled with reference to Figs 4 to 7.

A simplified block diagram of a video encoding device implementing a hybrid video coder using multiple coding engines is shown in Fig.4. Several coding engines implement predetermined coding techniques, for instance a coding engine 41 can implement the intra-DCT coding method, a second one 42 the inter-DCT coding method, and a third one 43 the matching pursuit algorithm. Each frame of the input video sequence is received ("video signal") by a block partitioner device 44, which partitions the image into individual blocks of varying size, and decides which coding engine will process the current original block. The decisions representing the block position, its size and the selected coding engine is then inserted into the bitstream by a coding device 45. The current original signal block is then transferred to the selected coding engine (the engine 43 in the situation illustrated in Fig.4).
A matching pursuit coding engine is illustrated in Fig.5. Each of the original signal blocks of the input video sequence assigned to the coding engine 43 is received on one side by motion compensating means 51 for determining motion vectors (said motion vectors are conventionally found using the block matching algorithm), and the vectors thus obtained are coded by motion vector coding means 52, the coded vectors being delivered to a multiplexer 53 (referenced, but not shown). On the other side, a subtracter 54 delivers on its output the residual signal between the current image and its prediction. Said residual signal is then decomposed into atoms (the dictionary of atoms is referenced 57) and the atom parameters thus determined (module 55) are coded (module 56). The coded motion vectors and atom parameters then form a bitstream that is sent to match a predefined condition for each frame of the sequence.
The encoding engine 43 carries out a method of coding an input bitstream that comprises the following steps. First, as in most coding structures, the original frames of the input sequence are motion-compensated (each one is motion-compensated on the basis of the previous reconstructed frame, and the motion vectors determined during said motion-compensated step are stored in view of their later transmission). Residual signals are then generated by difference between the current frame and the associated motion-compensated prediction. Each of said residual signals is then compared with a dictionary of functions consisting of a collection of 2D separable Gabor functions, in order to generate a dictionary structure g (t) specified by the indexing parameter γ an expansion coefficient
p(n) and a residual Rn(t) - p. gy (t) which is passed on to the next stage of this iterative procedure. Once the atom parameters are found, they can be coded (together with the



decoded atom parameters to an atom device 72 (the dictionary of atoms is referenced 73) which reconstructs the matching pursuit functions at the decoded position within the assigned video block to form the decoded residual signal. The entropy decoder device also outputs motion vectors which are fed into a motion compensation device 74 to form a motion prediction signal from previously reconstructed video signals. The motion prediction and the reconstructed residual signal are then summed in an adder 75 to produce a video signal reconstructed block.
The interest of the previous approach, recalled above in a detailed manner, resides in the fact that because a single atom cannot span several blocks, it does not have to deal with the high-frequency discontinuities at block edges. Instead, it can be adapted to block boundaries, and even to block sizes, by designing block-size dependent dictionaries. Moreover, since overlapped motion compensation is no longer mandatory to preserve the MP efficiency, classical motion compensation may be used. However, with such an approach, it is not sure that the dictionary is well adapted to the structure of the signal to be modelled, when its atoms are confined in arbitrarily sized blocks.
SUMMARY OF THE INVENTION
It is therefore an object of the invention to propose a video encoding method based on matching pursuit algorithm and solving the above-indicated problem of adaptation.
To this end, the invention relates to a video encoding method such as defined in the introductory part of the description and which is moreover such that, when using said MP algorithm, a specific dictionary is available at the encoding side for each block shape respectively.
In another implementation of the method according to the invention, when using said MP algorithm, several dictionaries are available at the encoding side, and a bitstream syntax is defined for placing at a predetermined level flags provided to indicate which dictionary should be used.
It is another object of the invention to propose video encoding devices allowing to carry out these two implementations of the method according to the invention.
It is still an object of the invention to propose video decoding methods and devices allowing to decode signals coded by means of said video encoding methods and devices.

BRIEF DESCRIPTION OF THE DRAWINGS
The present invention will now be described, by way of example, with reference to the accompanying drawing in which :
- Fig.l allows a visualization of the 400 basis functions of the 2D Gabor dictionary used in the implementation of the matching pursuit algorithm ;
- Fig.2 illustrates the example of an atom placed in a block-divided image without block-restrictions;
- Fig.3 illustrates the case of a block-restricted matching pursuit residual coding, with an atom being confined into the motion-compensated grid and acting only on a block at a time ;
- Fig.4 illustrates an example of hybrid video encoder ;
- Fig. 5 shows an example of a video encoding device for implementing a MP algorithm;
- Fig.6 illustrates an example of hybrid video decoder according to the invention ;
- Fig.7 shows an example of a video decoding device implementing the MP algorithm.
DETAILED DESCRIPTION OF THE INVENTION
A simplified block diagram of a video encoding device implementing a matching pursuit algorithm has been described above in relation with Fig. 5. This encoding device carries out a method of coding an input bitstream that comprises the same steps as described above :
- the original frames of the input sequence are motion-compensated ;
- residual signals are generated by difference between the current frame and the associated motion-compensated prediction;
- each of said residual signals is compared with a dictionary of functions consisting of a collection of 2D separable Gabor functions ;
- once the atom parameters are found, they can be coded (together with the motion vectors previously determined), the coded signals thus obtained forming the bitstream sent to the decoder.
The technical solution now proposed according to the invention consists in having separate dictionaries, one for each block shape respectively (4 x 4, 4 x 8, 8 x 4, 8 x 8, 8 x 16,16 x 8,16 x 16, for example): with such a rule used by the encoder, the video decoder would implicitly know which dictionary an atom refers to. According to another

implementation of the invention, the technical solution can also consists in providing several dictionaries, available at both the encoding side and decoding side, and in defining a bitstream syntax, which lets the encoder say to the decoder which dictionary should be used : for instance, the codeword MP_dictionary_l tells the decoder that the next atom will refer to the first dictionary, MP_dictionary__2 tells the decoder to switch to the second dictionary, and so on, such codewords, or flags, being placed for example at the atom level, the block level, the macroblock level or the picture level.






CLAIMS :
1. A video encoding method applied to an input sequence of frames in which each frame is subdivided into blocks of arbitrary size, said method comprising for at least a part of said blocks of the current frame the steps of:
- generating on a block basis motion-compensated frames, each one being obtained from each current original frame and a previous reconstructed frame;
- generating from said motion-compensated frames residual signals ;
- using a so-called matching pursuit (MP) algorithm for decomposing each of said generated residual signals into coded dictionary functions called atoms, the other blocks of the current frame being processed by means of other coding techniques ;
- coding said atoms and the motion vectors determined during the motion compensation step, for generating an output coded bitstream ;
said method being further characterized in that, when using said MP algorithm, a specific dictionary is available at the encoding side for each block shape respectively.
2. A video encoding method according to claim 1, characterized in that, when using said MP algorithm, several dictionaries are available at the encoding side, a bitstream syntax being defined for placing at a predetermined level flags provided to indicate which dictionary should be used.
3. A method according to claim 2, characterized in that said flags are placed at the atom level.
4. A method according to claim 2, characterized in that said flags are placed at the block level.
5. A method according to claim 2, characterized in that said flags are placed at the macroblock level.
6. A method according to claim 2, characterized in that said flags are placed at the picture level.
7. A video encoding device applied to an input sequence of frames in which each frame is subdivided into blocks of arbitrary size, said device applying to at least a part of said blocks of the current frame the following means :

- means for generating on a block basis motion-compensated frames, each one being obtained from each current original frame and a previous reconstructed frame;
- means for generating from said motion-compensated frames residual

signals ;
- means for performing a so-called matching pursuit (MP) algorithm for decomposing each of said generated residual signals into coded dictionary functions called atoms, the other blocks of the current frame being processed by means of other coding techniques;
- means for coding said atoms and the motion vectors determined during the motion compensation step, for generating an output coded bitstream ;
said device being further characterized in that, when using said MP algorithm, several dictionaries are available at the encoding side, one for each block shape. 8. A video decoding method for decoding a coded bitstream generated by implementation of a video encoding method applied to an input sequence of frames in which each block is subdivided into blocks of arbitrary size, said encoding method comprising for at least a part of said blocks of the current frame the steps of:
- generating on a block basis motion-compensated frames, each one being obtained from each current original frame and a previous reconstructed frame;
- generating from said motion-compensated frames residual signals ;
- using a so-called matching pursuit (MP) algorithm for decomposing each of said generated residual signals into coded dictionary functions called atoms, the other blocks of the current frame being processed by means of other coding techniques;
- coding said atoms and the motion vectors determined during the motion compensation step, for generating an output coded bitstream ;
a specific dictionary being available at the encoding side for each block shape respectively, said decoding method comprising the steps of:
- decoding said atoms and motion vectors;
- using the MP algorithm for reconstructing residual signals ;
- generating from said reconstructed signals and predicted signals built from the coded motion vectors output reconstituted signal corresponding to the original frames of said input sequence ;
said decoding method being further characterized in that the same dictionaries as at the encoding side are available at the decoding side, one for each block shape respectively.

9. A video decoding device for decoding a coded bitstream generated by
implementation of a video encoding method applied to an input sequence of frames
in which each block is subdivided into blocks of arbitrary size, said encoding
method comprising for at least a part of said blocks of the current frame the steps
of:
- generating on a block basis motion-compensated frames, each one being obtained from each current original frame and a previous reconstructed frame;
- generating from said motion-compensated frames residual signals ;
- using a so-called matching pursuit (MP) algorithm for decomposing each of said generated residual signals into coded dictionary functions called atoms, the other blocks of the current frame being processed by means of other coding techniques;
- coding said atoms and the motion vectors determined during the motion compensation step, for generating an output coded bitstream ;
a specific dictionary being available at the encoding side for eache block shape respectively, said decoding device applying to the concerned blocks the following means :
- means for decoding said atoms and motion vectors ;
- means for performing the MP algorithm, for reconstructing residual signals;
- means for generating from said reconstructed signals and predicted signals built from the coded motion vectors output reconstituted signal corresponding to the original frames of said input sequence ;
said decoding device being further characterized in that the same dictionaries as at the encoding side are available at the decoding side, one for each block shape respectively.
10. A video encoding device applied to an input sequence of frames in which each
frame is subdivided into blocks of arbitrary size, said device applying to at least a
part of said blocks of the current frame the following means :
- means for generating on a block basis motion-compensated frames, each one being obtained from each current original frame and a previous reconstructed frame;
- means for generating from said motion-compensated frames residual signals ;

- means for performing a so-called matching pursuit (MP) algorithm for decomposing each of said generated residual signals into coded dictionary functions called atoms, the other blocks of the current frame being processed by means of other coding techniques ;
- means for coding said atoms and the motion vectors determined during the motion compensation step, for generating an output coded bitstream ;
said device being further characterized in that, when using said MP algorithm, several dictionaries are available at the encoding side, and a bitstream syntax is defined for placing at a predetermined level flags provided to indicate which dictionary should be used.
11. A method according to claim 10, characterized in that said flags are placed at the atom level.
12. A method according to claim 10, characterized in that said flags are placed at the block level.
13. A method according to claim 10, characterized in that said flags are placed at the macroblock level.
14. A method according to claim 10, characterized in that said flags are placed at the picture level.
15. A video decoding method for decoding a coded bitstream generated by implementation of a video encoding method applied to an input sequence of frames in which each block is subdivided into blocks of arbitrary size, said encoding method comprising for at least a part of said blocks of the current frame the steps of:

- generating on a block basis motion-compensated frames, each one being obtained from each current original frame and a previous reconstructed frame;
- generating from said motion-compensated frames residual signals;
- using a so-called matching pursuit (MP) algorithm for decomposing each of said generated residual signals into coded dictionary functions called atoms, the other blocks of the current frame being processed by means of other coding techniques;
- coding said atoms and the motion vectors determined during the motion compensation step, for generating an output coded bitstream;
several dictionaries being available at the encoding side, together with a bitstream syntax defined for placing at a predetermined level flags provided to indicate which dictionary should be used, said decoding method comprising the steps of:
- decoding said atoms and motion vectors,

• using the MP algorithm for reconstructing residual signals;
- generating from said reconstructed signals and predicted signals built from the
coded motion vectors output reconstituted signal corresponding to the original frames of
said input sequence;
said decoding method being further characterized in that the same dictionaries as at the
encoding side are available at the decoding side, and a step is provided for reading the
transmitted flags and, when using the MP algorithm, selecting the corresponding
dictionary.
16. A video decoding device for decoding a coded bitstream generated by
implementation of a video encoding method applied to an input sequence of frames in
which each block is subdivided into blocks of arbitrary size, said encoding method
comprising for at least a part of said blocks of the current frame the steps
of:
- generating on a block basis motion-compensated frames, each one being obtained from each current original frame and a previous reconstructed frame;
- generating from said motion-compensated frames residual signals ;
- using a so-called matching pursuit (MP) algorithm for decomposing each of said generated residual signals into coded dictionary functions called atoms, the other blocks of the current frame being processed by means of other coding techniques ;
- coding said atoms and the motion vectors determined during the motion compensation step, for generating an output coded bitstream;
several dictionaries being available at the encoding side, together with a bitstream syntax defined for placing at a predetermined level flags provided to indicate which dictionary should be used, said decoding device applying to the concerned blocks the following means:
- means for decoding said atoms and motion vectors,
- means for performing the MP algorithm, for reconstructing residual signals ;
- means for generating from said reconstructed signals and predicted signals built from the coded motion vectors output reconstituted signal corresponding to the original frames of said input sequence ;
said decoding device being further characterized in that the same dictionaries as at the encoding side are available at the decoding side, and means are provided for reading the

0 2005/015501 PCT/IB2004/002478
17 transmitted flags and, when performing the MP algorithm, selecting the corresponding
dictionary.


Documents:

0524-chenp-2006 complete specification as granted.pdf

524-CHENP-2006 AMENDED PAGES OF SPECIFICATION 15-04-2013.pdf

524-CHENP-2006 AMENDED PAGES OF SPECIFICATION 19-11-2012.pdf

524-CHENP-2006 AMENDED CLAIMS 15-04-2013.pdf

524-CHENP-2006 AMENDED CLAIMS 19-11-2012.pdf

524-CHENP-2006 CORRESPONDENCE OTHERS 04-10-2012.pdf

524-CHENP-2006 CORRESPONDENCE OTHERS 15-04-2013.pdf

524-CHENP-2006 EXAMINATION REPORT REPLY RECEIVED 19-11-2012.pdf

524-CHENP-2006 FORM-1 19-11-2012.pdf

524-CHENP-2006 FORM-3 15-04-2013.pdf

524-CHENP-2006 FORM-3 19-11-2012.pdf

524-CHENP-2006 FORM-5 19-11-2012.pdf

524-CHENP-2006 ASSIGNMENT 14-06-2010.pdf

524-CHENP-2006 CORRESPONDENCE OTHERS 14-03-2013.pdf

524-chenp-2006 form-1 14-06-2010.pdf

524-CHENP-2006 FORM-2 14-06-2010.pdf

524-CHENP-2006 FORM-6 07-02-2008.pdf

524-CHENP-2006 FORM-6 14-06-2010.pdf

524-CHENP-2006 POWER OF ATTORNEY 03-04-2013.pdf

524-CHENP-2006 POWER OF ATTORNEY 14-06-2010.pdf

524-CHENP-2006 REQUEST FOR POST DATING 11-03-2013.pdf

524-CHENP-2006 CORRESPONDENCE OTHERS.pdf

524-CHENP-2006 FORM 13.pdf

524-CHENP-2006 FORM 18.pdf

524-CHENP-2006 FORM 3.pdf

524-CHENP-2006 FORM 6.pdf

524-chenp-2006-abstract.pdf

524-chenp-2006-claims.pdf

524-chenp-2006-correspondnece-others.pdf

524-chenp-2006-description(complete).pdf

524-chenp-2006-drawings.pdf

524-chenp-2006-form 1.pdf

524-chenp-2006-form 26.pdf

524-chenp-2006-form 3.pdf

524-chenp-2006-form 5.pdf

524-chenp-2006-pct.pdf


Patent Number 256341
Indian Patent Application Number 524/CHENP/2006
PG Journal Number 23/2013
Publication Date 07-Jun-2013
Grant Date 04-Jun-2013
Date of Filing 10-Feb-2006
Name of Patentee TRIDENT MICROSYSTEMS (FAR EAST) LTD.
Applicant Address UGLAND HOUSE, SOUTH CHURCH STREET, GRAND CAYMAN
Inventors:
# Inventor's Name Inventor's Address
1 VALENTE, STEPHANE C/O SOCIETE CIVILE SPID, 156 BOULEVARD HAUSSMANN, F--75008 PARIS, FRANCE
PCT International Classification Number G06T9/00
PCT International Application Number PCT/IB04/02478
PCT International Filing date 2004-07-14
PCT Conventions:
# PCT Application Number Date of Convention Priority Country
1 03300085.2 2003-08-12 EUROPEAN UNION
2 03300084.5 2003-08-12 EUROPEAN UNION