Title of Invention

DEVICE AND METHOD FOR ENCODING VIDEO OR FILM TYPE IMAGE

Abstract The invention relates to the processing of video signals prior to encoding or other compression operations, and, more particularly, to a method for encoding video signals corresponding to a sequence of frames each of which consists of two fields F1 and F2. The proposed method comprises the steps of receiving successive frames of an input video signal and delaying them with at least a 'two fields' duration delay, and detecting any dominance change and adjusting said delay. When a change from an F1 dominance to an F2 dominance is detected, the first field of the first F2 dominant frame is suppressed, and said delay is decreased by a quantity equal to 'one field' duration; when a change from an F2 dominance to an F1 dominance is detected, the last field of the last F2 dominant frame is repeated, and the delay is further increased by a quantity equal to 'one field' duration.
Full Text

METHOD AND DEVICE FOR ENCODING SEQUENCES OF FRAMES INCLUDING EITHER VIDEO-TYPE OR FILM-TYPE IMAGES
FIELD OF THE INVENTION
The present invention relates to a method for encoding video signals corresponding to a sequence of frames each of which originally consists of two fields F1 and F2, and to a corresponding encoding device.
BACKGROUND OF THE INVENTION
In a video sequence composed of successive interlaced pictures (or frames), each frame is constituted by a pair of fields F1 and F2, as illustrated in Fig.1, which shows successive pairs of fields and the associated synchronization signal (each frame comprises a top field F(2n-1) (with n>0), or odd field, and a bottom field F(2n), or even field, the odd frames being of type F1 and the even frames of type F2). When such video fields come out of a video camera or of any other type of video signal generator, for instance at a rate of 50 fields/second (25 frames/second) or 60 fields/second (30 frames/second), the video material has no field dominance (a frame is said to be "F1 dominant" if it is constituted by a first field F1 followed by a second field F2, and "F2 dominant" if it is constituted by a field F2 followed by a field F1).
The field dominance becomes relevant when transferring data in such a way that frame boundaries must be known and preserved. When the video material is edited at frame boundaries, with a video recorder for example, a decision is provided for specifying if the video material is F1 dominant or F2 dominant: Figs.3 and 4 respectively show, for a preexisting video material as indicated in Fig.2, the structure of an F1 dominant video material and of an F2 dominant video material. Once some material has acquired a particular dominance, it must be manipulated with that dominance. Otherwise, a shift can occur in the representation of a frame, as shown in Fig.5: the first two frames are F1 dominant, but the third one is F2 dominant and composed of two fields which originally did not belong to the same frame. In such a case, encoding is less efficient: a scene cut between the two fields of an encoded frame costs a lot in terms of bitrate allocation efficiency. Moreover, F2 dominance may lead to annoying vertical movement of pictures when a DVD player outputs frames in slow motion or still image mode.

SUMMARY OF THE INVENTION
It is therefore an object of the invention to propose an encoding method in which the above-indicated drawbacks are avoided and the picture quality of any encoded video programme is increased.
To this end, the invention relates to a method such as described in the introductory paragraph of the description and in which the encoding step is preceded by a preprocessing step which comprises the sub-steps of:
(A) receiving the successive frames and delaying them with at least a "two fields" duration delay;
(B) adjusting said delay according to the following dominance change criterion:
(a) when a change from an F1 dominance to an F2 dominance is detected, the first field of the first F2 dominant frame is suppressed, said delay being therefore decreased by a quantity equal to "one field" duration;
(b) when a change from an F2 dominance to an F1 dominance is detected, the last field of the last F2 dominant frame is repeated, the delay being therefore further increased by a quantity equal to "one field" duration.
The method thus proposed makes it possible to detect the changes in field dominance and to correct the input sequencing so that the frames can be encoded correctly.
In an improved embodiment of the invention, in which the sequence of frames is constituted either by film-type images, to which the 3:2 pull-down technique has been applied, or by video-type images consisting of two fields, said method comprises the steps of:
(A) detecting that the current sequence is constituted by film-type images;
(B) encoding said current sequence, either after said preprocessing step when it is not detected as being of film-type or after implementation, on said current sequence, of the inverse 3:2 pull-down technique if it is detected as being of film-type;
and said detecting step comprises the sub-steps of:
(a) defining for two successive fields F(n) and F(n+2) of the same parity a number of pixels N2 such as N2 = NTOT - N'2, where NTOT is the number of pixels in a field, and N'2 is the number of pixels for which ABS(val F(n) - val F(n+2)) [...]
(b) comparing the result of the subtraction of two consecutive numbers N2, divided by NTOT, to a second predefined threshold THR;
(c) detecting that the current sequence is constituted by film-type images only when said result is lower than said second threshold, said fields being then considered as equal.
It is also an object of the invention to propose a corresponding encoding device.
To this end, the invention relates to a device for encoding video signals corresponding to a sequence of frames each of which originally consists of two fields F1 and F2, said sequence being constituted either by film-type images, to which the 3:2 pull-down technique has been applied, or by video-type images consisting of two fields, said device comprising:
(A) means for detecting in the input sequence of frames a sequence of film-type images;
(B) means for receiving the successive frames of the input sequence, delaying each of them with a delay of at least two fields, and adjusting said delay according to the following dominance change criterion:
(a) when a change from an F1 dominance to an F2 dominance is detected, the first field of the first F2 dominant frame is suppressed, said delay being therefore decreased by a quantity equal to "one field" duration;
(b) when a change from an F2 dominance to an F1 dominance is detected, the last field of the last F2 dominant frame is repeated, the delay being therefore increased by a quantity equal to "one field" duration;
(C) means for encoding the input sequence of frames, either connected in series with means (B) when said sequence is not detected as being of film-type or after implementation of the inverse 3:2 pull-down technique if it is detected as being of film-type.
BRIEF DESCRIPTION OF THE DRAWINGS
The particularities of the invention will now be explained in a more detailed manner, with reference to the accompanying drawings in which:
- Fig.1 shows, at a rate given by the associated synchronization signal on the time axis, a video sequence constituted by successive pairs of fields;
- Fig.2 shows the successive frames F1, F2 of a preexisting video material, Figs.3 and 4 illustrate the structure of F1 dominant and F2 dominant video material, and Fig.5 illustrates the case of a video sequence in which a shift in the representation of the frames has occurred;
- Fig.6 shows an embodiment of a preprocessing device according to the invention;
- Fig.7 illustrates the mechanism according to which the sequence is modified by suppression or repetition of a field, in relation with the type of dominance detection carried out in the preprocessing device;
- Fig.8 illustrates the 3:2 pull-down technique, which makes it possible to construct a sequence of five interlaced frames, or pairs of fields F(n) to F(n+9), with n=1 in the present case, from four original sequential frames;
- Fig.9 shows how fields are sequenced for the film mode format and illustrates the set of tests (identical or not?) to be carried out for the detection of a 3:2 pull-down structure;
- Fig.10 shows an encoding system in which the method according to the invention is implemented;
- Fig.11 is an implementation of a preprocessing device comprised in the encoding device of Fig.10.
DETAILED DESCRIPTION OF THE INVENTION
An example of implementation of a preprocessing device according to the invention (before coding in a coding device 1003) is illustrated in Fig.6, in the case where the input video stream is a sequence composed of information corresponding to images of the video type, i.e. composed (as already shown in Fig.1) of successive pairs of frames F(1), F(2), ..., F(i), ... and so on.
Such a sequence is assumed to be F1 dominant, which corresponds in Fig.6 to the upper position of a switch 61; each successive input field IF is then delayed in a memory 63, with a delay of two fields, or at least two fields (this delay is illustrated in line (b) of Fig.7 for frames 1 to 3, by comparison with the corresponding frames of line (a)). When a change from "F1 dominant" to "F2 dominant" is detected by means of a circuit 64 for the detection of a field dominance change (instant t12 in line (a) of Fig.7), the switch 61, controlled by this circuit 64, moves to its lower position (see Fig.6), for which each successive input field IF is now delayed in a memory 65, with a delay of only one field (or one field less, in the case of a greater delay for the memory 63). The first field of the first frame with F2 dominance is suppressed, and all the subsequent input fields are now delivered with only a "one field" duration delay (see the frames 4 and 5 in line (b) of Fig.7), so that no gap occurs in the output sequence.
When a further change from "F2 dominant" to "F1 dominant" is detected by the circuit 64 (instant t21 in line (a) of Fig.7), the last field F1 of the last F2 dominant frame is repeated in order to retrieve a correct sequencing: all the subsequent input fields are now, as initially, delivered again with a "two fields" duration delay (see the frames 6 and 7 in line (b) of Fig.7), or one field more in the case of a greater delay for the memory 63.
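The delay adjustment described above can be illustrated with a small simulation. The following Python sketch is only an illustration under stated assumptions: the function name `preprocess`, its parameters, and the field tick at which each dominance change is signalled are hypothetical and are not taken from the description. It models the memories 63 and 65 as a single variable delay line and shows that shortening the delay by one field drops exactly one field, while lengthening it by one field repeats exactly one field.

```python
def preprocess(fields, to_f2_at=None, to_f1_at=None, base_delay=2):
    """Variable delay line over a list of field labels.

    At every field tick t the output is the input field of tick t - delay.
    to_f2_at / to_f1_at are the (hypothetical) ticks at which the dominance
    change detector signals an F1->F2 or an F2->F1 change.
    """
    delay = base_delay
    output = []
    for t in range(len(fields) + base_delay):
        if t == to_f2_at:
            delay -= 1      # F1 -> F2 change: one input field is skipped
        if t == to_f1_at:
            delay += 1      # F2 -> F1 change: one input field is output twice
        src = t - delay
        # None marks ticks for which no input field is available
        output.append(fields[src] if 0 <= src < len(fields) else None)
    return output


fields = list("ABCDEFGHIJ")  # ten successive input fields
print(preprocess(fields, to_f2_at=5, to_f1_at=9))
```

With these inputs, field D is suppressed and field H is output twice; after the initial two-field latency the output sequence contains no gap.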
The detection of dominance in the field dominance change detection circuit 64 is for instance made through the use of a scene cut detection method, carried out between consecutive fields. Such a method is described for example in documents such as "Hierarchical scene change detection in an MPEG-2 compressed video sequence", by T. Shin et al., Proceedings of the 1998 IEEE ISCAS, May 31, 1998, Monterey, CA, USA, pp. IV-253 to IV-256, or "A unified approach to shot change detection and camera motion characterization", by P. Bouthemy et al., IEEE Transactions on Circuits and Systems for Video Technology, vol.9, n°7, October 1999, pp.1030-1044.
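As a rough illustration of how a scene cut measured between consecutive fields could reveal a dominance change, the Python sketch below flags a cut when the mean absolute luminance difference between two consecutive fields exceeds a threshold, and then checks whether that cut falls between the two fields of a frame. This is not the method of the cited documents: the difference measure, the threshold value, and the names `mean_abs_diff`, `cut_positions` and `cut_splits_a_frame` are all assumptions made for the example, and the vertical offset between fields of opposite parity is ignored.

```python
def mean_abs_diff(field_a, field_b):
    """Mean absolute luminance difference between two fields (flat lists)."""
    return sum(abs(a - b) for a, b in zip(field_a, field_b)) / len(field_a)


def cut_positions(fields, cut_threshold):
    """Indices i such that a scene cut lies between fields[i-1] and fields[i]."""
    return [
        i for i in range(1, len(fields))
        if mean_abs_diff(fields[i - 1], fields[i]) > cut_threshold
    ]


def cut_splits_a_frame(cut_index):
    """True if the cut lies between the two fields of a frame.

    Assumes fields 0 and 1 form the first frame, fields 2 and 3 the second,
    and so on, so a cut at an odd index falls inside a frame.
    """
    return cut_index % 2 == 1


# Six tiny two-pixel fields: the scene changes between fields 2 and 3.
fields = [[10, 10], [10, 10], [10, 10], [200, 200], [200, 200], [200, 200]]
cuts = cut_positions(fields, cut_threshold=50)
print(cuts, [cut_splits_a_frame(i) for i in cuts])
```

A cut that splits a frame under the current pairing is the situation of Fig.5, where the two fields of a frame do not belong to the same original picture.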
An improved embodiment of the invention may also be proposed in the following case. In the NTSC standard, the picture frequency is 30 interlaced frames per second. However, for movies, the frames are produced at a frame rate of 24 Hz. When it is required to visualize a sequence of film-type images on television, it is therefore necessary to convert the movie's frame rate to the NTSC standard. The technique currently used, which is known as "3:2 pull-down" and is described for instance in the international patent application WO 97/39577, consists of creating five interlaced frames (which can therefore be visualized on television) based on four original sequential film frames. This is obtained by dividing each of these four sequential frames by two, so as to form four odd and four even fields, and by duplicating two of these eight fields.
As illustrated in Fig.8, which shows a film sequence at 24 Hz on the first line and illustrates on the second line how to organize the field sequencing of a corresponding video sequence at 30 Hz, it means that an additional field is inserted for each pair of film frames, for instance by splitting one film frame out of two into three fields, the other one being split as usual into two fields. In the case of the frame split into three fields (for instance, G1G2 split into F1, F2, F3, or G5G6 split into F6, F7, F8), the third one is obtained by copying the odd (F1) or the even field (F6) alternately, in order to keep the sequencing "odd/even". The result is the following:

F1 = F3 = G1
F2 = G2
F4 = G4
F5 = G3
F6 = F8 = G6
F7 = G5
F9 = G7
F10 = G8, and so on.
These two additional fields obtained by duplication constitute redundant information. When encoding such sequences according to the MPEG-2 standard, it is useful to detect said information: the suppression of these repeated fields will then free some space to better encode the others, the concerned MPEG-2 encoder thus receiving video-type image sequences at 30 Hz and original film-type image sequences at 24 Hz.
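The field mapping listed above can be written out directly. The Python sketch below reproduces that mapping for one group of four film frames (eight fields G1 to G8) and, as an assumption about how the repeated fields would be removed, an inverse that drops F3 and F8 and restores the film field order. The function names are hypothetical.

```python
def pull_down_32(film_fields):
    """Map eight film fields [G1..G8] to ten video fields [F1..F10]."""
    g1, g2, g3, g4, g5, g6, g7, g8 = film_fields
    # F1=G1, F2=G2, F3=G1, F4=G4, F5=G3, F6=G6, F7=G5, F8=G6, F9=G7, F10=G8
    return [g1, g2, g1, g4, g3, g6, g5, g6, g7, g8]


def inverse_pull_down_32(video_fields):
    """Drop the repeated fields F3 and F8 and restore the order G1..G8."""
    f1, f2, _f3, f4, f5, f6, f7, _f8, f9, f10 = video_fields
    return [f1, f2, f5, f4, f7, f6, f9, f10]


film = ["G1", "G2", "G3", "G4", "G5", "G6", "G7", "G8"]
video = pull_down_32(film)
print(video)
print(inverse_pull_down_32(video) == film)
```

Odd-numbered video fields only ever receive odd film fields and even-numbered ones only even film fields, which is the "odd/even" sequencing the text refers to.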
A usual criterion to detect automatically sequences coming from movies (film-type image sequences) is therefore the following: a structure of five frames, i.e. of ten fields, is analyzed by means of a subtraction of consecutive fields of the same parity. The condition to detect the 3:2 pull-down structure is the following:
F1 = F3
F2 ≠ F4
F3 ≠ F5
F4 ≠ F6
F5 ≠ F7
F6 = F8
F7 ≠ F9
F8 ≠ F10,
which is illustrated in the sequence of Fig.9, where f1, f2, ... designate the successive frames, 1o-1e, 1o-2e, 2o-3e, ... the corresponding pairs of fields, y the reply "yes" to the test of comparison (i.e. fields equal), and n the reply "no" (i.e. fields different). If all these conditions are satisfied, then the inverse 3:2 pull-down conversion is performed on a group of five frames; on the contrary, if one of these conditions is not valid, the encoder goes back to the video mode (no elimination of two fields).
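The eight tests above amount to comparing each field with the next field of the same parity and matching the yes/no answers against a fixed pattern. A minimal Python sketch follows, assuming noise-free fields that can be compared for strict equality; the names `EXPECTED_PATTERN` and `is_pull_down_group` are hypothetical.

```python
# Expected answers to the tests F(n) = F(n+2) for n = 1..8:
# equal only for (F1, F3) and (F6, F8).
EXPECTED_PATTERN = [True, False, False, False, False, True, False, False]


def is_pull_down_group(fields):
    """Check the 3:2 pull-down condition on ten consecutive fields F1..F10."""
    if len(fields) != 10:
        raise ValueError("exactly ten fields (five frames) are expected")
    tests = [fields[n] == fields[n + 2] for n in range(8)]
    return tests == EXPECTED_PATTERN


film_group = ["G1", "G2", "G1", "G4", "G3", "G6", "G5", "G6", "G7", "G8"]
video_group = ["V1", "V2", "V3", "V4", "V5", "V6", "V7", "V8", "V9", "V10"]
print(is_pull_down_group(film_group), is_pull_down_group(video_group))
```

Note that a perfectly static group, in which all ten fields are identical, fails this strict test because the six "different" conditions are not met.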
However, due to the possible presence of noise on the original 3:2 pull-down sequence, the equality criterion between two fields (F1, F3 and F6, F8) may not be strictly verified. Two fields of the same parity F(N) and F(N+2) are considered. If NTOT designates the total number of pixels in a field (172800 for a full resolution), val(F(N)) designates the luminance value for a given pixel, N1 is the number of picture elements (pixels) such as ABS[val(F(N)) - val(F(N+2))] > THRES1, Nm is the number of pixels such as ABS[val(F(N)) - val(F(N+2))] [...] IF ((N1 [...] ELSE: F(N) ≠ F(N+2). The first criterion (N1 [...]
Troubles within the film mode detection step may consequently occur mostly in the case of the two following contrasted situations. For static or quasi-static sequences, the dissimilarity criterion is no longer verified, since the fields are nearly all equal, and may therefore be suppressed, the residual conditions needed to be fulfilled being then only F1 = F3 and F6 = F8. But, for a very noisy sequence, with which two identical fields may however seem unlike, the threshold setting the likeness criterion cannot be increased too much, otherwise fields that are different could be considered as identical.
The criterion for detecting automatically sequences coming from movies may then be modified on the basis of the following remark. By looking at the N2 statistics (N2 has been defined hereinabove), the applicant has noticed that N2 for fields F1 and F3 (referenced N2[1,3]) and N2 for fields F6 and F8 (referenced N2[6,8]) are small compared to the others (more generally, N2[i,j] stands for statistics of N2 calculated for Fj-Fi). Then, by computing the difference between two consecutive N2 statistics, for instance: N2[6,8] - N2[5,7], and comparing - in the form of a percentage - such a difference to a predetermined threshold (according to an expression of the following form: (N2[5,7] - N2[6,8]) x 100/NTOT for example), a large value of percentage is obtained every five computations. Therefore, if the computed percentage is less than X %, with for instance X = 30 %, then both fields (of the last considered pair of fields) are considered as equal, and the inverse 3:2 pull-down processing is carried out for the next five frames.
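The N2 statistic and the percentage expression can be sketched as follows. Because the source text defining the per-pixel threshold is incomplete, this Python example makes an explicit assumption: N'2 counts the pixels whose absolute luminance difference is below a threshold, here called `thres`, and N2 = NTOT - N'2 as stated in the summary. The function names and the threshold value are hypothetical, and the final film/video decision is deliberately left out.

```python
def n2_statistic(field_a, field_b, thres):
    """N2 = NTOT - N'2 for two same-parity fields given as flat pixel lists.

    Assumption: N'2 is the number of pixels whose absolute luminance
    difference is below thres.
    """
    ntot = len(field_a)
    n2_prime = sum(1 for a, b in zip(field_a, field_b) if abs(a - b) < thres)
    return ntot - n2_prime


def percent_difference(n2_previous, n2_current, ntot):
    """Difference of two consecutive N2 statistics as a percentage of NTOT."""
    return (n2_previous - n2_current) * 100.0 / ntot


# Tiny four-pixel fields: F5/F7 differ strongly, F6/F8 differ only by noise.
f5, f7 = [10, 20, 30, 40], [50, 60, 70, 41]
f6, f8 = [10, 20, 30, 40], [11, 19, 30, 42]
n2_57 = n2_statistic(f5, f7, thres=5)
n2_68 = n2_statistic(f6, f8, thres=5)
print(n2_57, n2_68, percent_difference(n2_57, n2_68, ntot=4))
```

In this toy case N2[5,7] is 3 and N2[6,8] is 0, so the expression (N2[5,7] - N2[6,8]) x 100/NTOT evaluates to 75, illustrating how a repeated pair of fields stands out from its neighbours even in the presence of small noise.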
An encoding system in which this preprocessing operation is included is described with reference to Fig.10. This encoding system comprises means 101 for encoding input signals corresponding to a sequence either coming from movies or of video type, means 102 for detecting in said input signals a sequence of film type (said detecting means being a detecting stage activated as explained later), and means 103 for switching, only when such a detection has occurred, from a first to a second mode of operation of the encoding means 101. The encoding means 101 comprise a first preprocessing device 1011, a second preprocessing device 1012, and a coding device 1013, for instance an MPEG-2 coder.
The detecting stage, illustrated in Fig.11, itself comprises a set of subtractors 141.1, 141.2, 141.3, ..., provided for receiving each one two successive fields of the same parity and determining per pixel the difference between these fields, followed by a set of circuits 142.1, 142.2, 142.3, ... provided for taking the absolute value of said difference; this value is stored in a memory, 143.1, 143.2, 143.3, ..., respectively. The successive differences between the successive values of these stored absolute values are then computed in subtractors 144.1, 144.2, 144.3, ..., and these differences, for instance multiplied by 100/NTOT as indicated above, are compared to the predefined threshold (tests CI). If the fields are equal, i.e. they correspond to film-type images (in the present case, for F1 = F3 and for F6 = F8), an inverse 3:2 pull-down processing can be carried out for the next five frames, in the first preprocessing device 1011; this situation corresponds to the lower position of the switching means 103. When it is not the case (video-type images), the switching means 103 are in the opposite position (upper position). The device 1011 is then de-activated, and at the same time the second preprocessing device 1012 becomes active (this device 1012 has exactly the same structure as the preprocessing device of Fig.6).
An encoding system corresponding to this last description may be used for transmitting animated images with television systems operating at a frequency of 60 hertz (for instance with the NTSC standard used in countries such as Japan or the United States of America).




WE CLAIM :
1. A device for encoding video signals corresponding to a sequence of frames each of which originally consists of two fields F1 and F2, said sequence being constituted either by film-type images, to which the 3:2 pull-down technique has been applied, or by video-type images consisting of two fields, said device comprising:
(A) means for detecting in the input sequence of frames a sequence of film-type images;
(B) means for receiving the successive frames of the input sequence, delaying (63) each of them with a delay of at least two fields, and adjusting said delay according to the following dominance change criterion:
(a) when a change from an F1 dominance to an F2 dominance is detected (64), the first field of the first F2 dominant frame is suppressed, said delay being therefore decreased by a quantity equal to "one field" duration;
(b) when a change from an F2 dominance to an F1 dominance is detected (64), the last field of the last F2 dominant frame is repeated, the delay being therefore increased by a quantity equal to "one field" duration;
(C) means (1003) for encoding the input sequence of frames, either connected in series with means (B) when said sequence is not detected as being of film-type or after implementation of the inverse 3:2 pull-down technique if it is detected as being of film-type.
2. The device as claimed in claim 1, wherein said detecting means comprises:
(a) a set of subtractors (141.1, 141.2,...), provided for receiving each one two successive fields of the same parity and determining per pixel the difference between these fields;
(b) a set of circuits (142.1, 142.2,...) provided for taking the absolute value of said difference;
(c) memories (143.1, 143.2,...), provided for storing said absolute values;
(d) subtractors (144.1, 144.2,...), provided for computing the successive differences between the successive values of these stored absolute values, comparing these differences to a predefined threshold, and detecting a sequence of film-type only when said difference is lower than a predefined threshold, said fields being then considered as equal.
3. A method for encoding video signals corresponding to a sequence of frames each of which originally consists of two fields F1 and F2, said encoding method comprising a preprocessing step, applied to the successive frames received, and an encoding step, applied to the preprocessed frames, said preprocessing step itself comprising the sub-steps of:
(A) receiving the successive frames and delaying each of them with a delay of at least two fields;
(B) adjusting said delay according to the following dominance change criterion:
(a) when a change from an F1 dominance to an F2 dominance is detected, the first field of the first F2 dominant frame is suppressed, said delay being therefore decreased by a quantity equal to "one field" duration;
(b) when a change from an F2 dominance to an F1 dominance is detected, the last field of the last F2 dominant frame is repeated, the delay being therefore increased by a quantity equal to "one field" duration.
4. The method as claimed in claim 3, wherein, said sequence of frames being constituted either by film-type images, to which the 3:2 pull-down technique has been applied, or by video-type images consisting of two fields, said method comprises the steps of:
(A) detecting that the current sequence is constituted by film-type images;
(B) encoding said current sequence, either after said preprocessing step when it is not detected as being of film-type or after implementation, on said current sequence, of the inverse 3:2 pull-down technique if it is detected as being of film-type;
and said detecting step comprises the sub-steps of:
(a) defining for two successive fields F(n) and F(n+2) of the same parity a number of pixels N2 such as N2 = NTOT - N'2, where NTOT is the number of pixels in a field, and N'2 is the number of pixels for which ABS(val F(n) - val F(n+2)) [...]
(b) comparing the result of the subtraction of two consecutive numbers N2, divided by NTOT, to a second predefined threshold THR;
(c) detecting that the current sequence is constituted by film-type images only when said result is lower than said second threshold, said fields being then considered as equal.




Patent Number 209373
Indian Patent Application Number IN/PCT/2001/471/CHE
PG Journal Number 38/2007
Publication Date 21-Sep-2007
Grant Date 28-Aug-2007
Date of Filing 02-Apr-2001
Name of Patentee M/S. KONINKLIJKE PHILIPS ELECTRONICS N.V
Applicant Address Groenewoudseweg 1, NL-5621 BA Eindhoven
Inventors:
# Inventor's Name Inventor's Address
1 DEL CORSO Sandra Prof. Holstlaan 6 NL-5656 AA Eindhoven
2 GAUTIER Pierre Prof. Holstlaan 6 NL-5656 AA Eindhoven
3 LE MAGUET Isabelle Prof. Holstlaan 6 NL-5656 AA Eindhoven
PCT International Classification Number H04N 7/26
PCT International Application Number PCT/EP2000/007425
PCT International Filing date 2000-07-31
PCT Conventions:
# PCT Application Number Date of Convention Priority Country
1 99401969.3 1999-08-03 EUROPEAN UNION
2 99403228.2 1999-12-21 EUROPEAN UNION