|Title of Invention||
METHOD OF DATA ENCODING FOR COMPRESSION FOR ALL COMPRESSION SCHEMES HAVING EMBEDDED ESCAPE TAGS
|Abstract||The present invention proposes a method of storing the compressed data for compression algorithm which will reduce the decoding time In the present method the encoded data is divided into two parts containing a header and the uncompressed part of data. The improvement in the decoding time is got by: • Removing the comparison for every uncompressed byte; • Block copying of the uncompressed data especially for embedded devices.|
|Full Text||FIELD OF TECHNOLOGY
The present invention generally relates to data compression technique. More particularly this invention relates to a method/technique of data encoding for compression to improve the decode time for all compression schemes having embedded escape tag.
PRESENT STATE OF ART
To be specific in data compression methods, Run Length Encoding LZ77 and LZW encoding are used. The details of these encoding methods are given below:
Run Length Encoding (RLE)
Run-length encoding is a very simple form of data compression encoding. It is based on the simple principle of encoding data. This principle is applied to every stream which is formed of the same data values (repeating values is called a run) i.e. sequence of repeated data values is replaced with count number and a single value. This intuitive principle works best on certain data types in which sequences of repeated data values can be noticed; RLE is usually applied to the files that a contain large number of consecutive occurrences of the same byte pattern.
RLE may be used on any kind of data regardless of its content, but data which is being compressed by RLE determines how good compression ratio will be achieved. So RLE is used on text files which contains multiple spaces for indention and formatting paragraphs, tables and charts. Digitized signals also consist of unchanged streams so such signals can also be compressed by RLE. RLE is a lossless type of compression and cannot achieve great compression ratios, but a good point of that compression is that it can be easily implemented and quickly executed. This has been shown in Figure 4. (http://www.rasip.fer.hr/research/compress/alqorithms/fund/rl/)
The algorithm searches the window for the longest match with the beginning of the look ahead buffer and outputs a pointer to that match. Since it is possible that not even, one-character match can be found, the output cannot contain just pointers. LZ77 solves this problem this way: after each pointer it outputs the first character in the lookahead buffer after the match. If there is no match, it outputs a null-pointer and the character at the coding position. (http://www.rasip.fer.hr/research/compress/alqorithms/fund/lz/lz77.html)
LZW is an algorithm based on the idea of a dynamic vocabulary, shorthand for
frequently repeated patterns. It is one of the tougher implementations along with
arithmetic. This algorithm also by its nature is readily recompressed by a huffman
or simple code table optimization.
Tag Based Encoding
Tag based encoding is a type of encoding where stream of input data is replaced by a special symbol followed by some data which is related to the replaced input data stream. By doing this size of the input data is reduced, thus achieving compression. Figure 1 shows a block representation of the encoding process, whereas
I: Input data stream
O: Output data stream = E( I)
Size of (O) Time (E (I)) and Time (D (O)) is constrained by its usage in real time. For non-real time application, time constraints are not so important to have better compression ratio.
structure of Encoded Data
Operation of TAG based encoding & decoding
IF DATA_COULD_BE_COMPRESSED THEN BEGIN
WRITE TAG, CODE in to Output Buffer
Skip index of input by input number of bytes processed END ELSE BEGIN
WRITE TAG Indicating tliat n bytes are not compressed
Sl^ip index by n END ENDIF END
Tills Tag Based Encoding has been shown in Figure 2.
IF COMPRESSED THEN
Using TAG, CODE Reproduce the input bytes
Write n number of bytes as it is
This Tag Based Decoding has been shown in Figure 3.
The existing methods as discussed above have the following problems:
1. They store run data along with the characters, which could not be compressed. This will involve a comparison for every byte to verify whether it is compressed or not. This consumes a lot of processor cycles.
2. For all uncompressed bytes an extra overhead of storing is involved (to mark it as not compressed) even though such bytes appear consecutively in the byte stream.
3. Block memory copy / store instruction for uncompressed bytes cannot be used as it is stored byte wise.
OBJECTS OF THE INVENTION
The primary object of this invention is to provide an improved and useful data compression method to improve the decode time for all compression schemes having embedded escape tags.
It is another object of the invention to provide a method for data encoding to improve decode time at the cost of decreased compression.
It is another object of the invention to provide a data compression method which will be useful in devices which need a faster decoding algorithm.
SUMMARY OF THE INVENTION
The present invention relates to a method of data encoding for compression to improve the decode time for all compression schemes having embedded escape tags wherein decoding efficiency is improved in that the block copy/move of encoded data is allowed, the implementations use run length encoding and LZ77, the control data is placed in the header or using an offset within the bit stream and the requirement of the comparison for every uncompressed byte is avoided.
The present invention proposes a method of storing the compressed data for compression algorithm which will reduce the decoding time.
In the present method the encoded data could be divided into two parts containing a header and the uncompressed part of data. The improvement in the decoding time will be got by:
1. Removing the comparison for every uncompressed byte;
2. Block copying of the uncompressed data especially for embedded devices.
Accordingly, the present invention comprises a method of data encoding for compression for all compression schemes having embedded escape tags wherein:
i) encoded data is block copied// moved
ii) run length encoding and LZ77 is implemented;
ill) the control data is placed in the header or using an offset within the bit
stream; and iv) comparison for every uncompressed byte is avoided.
The other objects, features and advantages of the present invention will be apparent from the accompanying drawings and the detailed description as follows.
BRIEF DESCRIPTION OF THE ACCOMPANYING DRAWINGS
Figure 1 represents a block diagram of tag based encoding process. Figure 2 shows a flow chart of the tag based encoding methods.
Figure 3 shows a flow chart of the tag based decoding methods.
Figure 4 shows a flow chart of the Run Length Encoding.
Figure 5 shows the flow chart of the encoding methods of present invention.
Figure 6 shows the flow chart of the decoding methods of present invention,
DETAILED DESCRIPTION OF THE INVENTION
A preferred embodiment of the present invention will now be explained with reference to the accompanying drawings. It should be understood however that the disclosed embodiment is merely exemplary of the invention, which may be embodied in various forms. The following description and drawings are not to be construed as limiting the invention and numerous specific details are described to provide a thorough understanding of the present invention, as the basis for the claims and as a basis for teaching one skilled in the art how to make and/or use the invention. However in certain instances, well-known or conventional details are not described in order not to unnecessarily obscure the present invention in detail.
Structure of the Invention
The encoded data is stored in the following manner, which will improve the
The Header contains the following information:
1. Total number of encodable lengths / blocks found in the byte stream;
2. For every encodable lengths / blocks store the index from where the output has to be written;
3. Encodable lengths / blocks data
4. The length of the encodable lengths / blocks data (1).
Uncompressed data is the byte stream, which could not be compressed. (This could again be compressed in the above mentioned manner, and could be divided into header and Byte stream)
Steps involved in this Invention
Referring to figure 2 which shows the flow chart of tag based encoding methods, the steps involved are as follows:
1) It is checked whether there are any inputs bytes to be compressed;
2) If no more bytes are there, the process stops;
3) If there are bytes to be compressed, it is checked whether it can be compressed;
4) If the byte cannot be compressed, the tag is added indicating that the byte cannot be compressed and data is sent to the output buffer and the process starting from step1 is continued;
5) If the byte can be compressed, then the tag and the encoded data are written in the output buffer and the process starting from step1 is continued.
Referring to figure 3 which shows a flow chart of the tag based decoding
the steps involved are as follows:
1) It is checked whether all the encoded data are decoded;
2) If all the data are decoded then the process is stopped;
3) If all the data is not decoded then it is checked whether the data present in the location is compressed or not;
4) If it is not compressed then the uncompressed byte is read and written into the output buffer and the process starting from step1 is continued;
If the data is compressed then it is uncompressed and the decoded data is written into the output buffer. After this the process starting from step1 is continued;
Referring to figure 4 which shows the flow chart of the Run Length Encoding, the steps followed are as follows:
1) The character count and repetition count are initialized as zero;
2) The character with the count number is read;
3) If the character is end of file(i.e. the last character), then the process is stopped;
4) If the character is not end of file then the character count is increased by one;
5) It is checked whether the character count is one or not;
6) If the character count is one then store that character in temporary location called as temporary character and the process starting from step 2 is continued;
7) If the character count is not equal to one then it is checked whether the character is the same as the previous character by comparing it with the temporary character;
8) If the character Is the same as the previous character then the repetition count is increased by one and the process starting from step 2 is continued;
9) If the character is different from the previous one then it is checked whether the repetition count is greater than / equal to or lesser than 4;
10) If the repetition count is lesser than 4 then the process starting from step 1 is continued as the code is to be as such and it not necessary to modify it;
11) If the repetition count is greater than/ equal to 4 then the compression code word is inserted and the process starting from step 1 is continued.
Referring to figure 5 which shows the flow chart of the encoding methods of
the steps followed in the encoding mechanism are as follows:
1) The total number of matches is initialized to zero;
2) The header is initialized to null;
3) It is checked that whether any input bytes to be compressed are left;
4) If no more bytes are left, then the process stops;
5) If there are bytes left, then it is checked whether they can be compressed;
6) If the bytes can be compressed then the header is updated with the tag and the code, the total number of matches is incremented and the process is continued starting from step3;
7) If the bytes cannot be compressed then the pointer is moved till a match is found or till the end of code stream is reached. Then the data is written in the uncompressed data buffer and the process starting from step 3 is continued.
Referring to figure 6 which shows the flow chart of the decoding methods of
the steps involved in the decoding mechanism are as follows:
1) The header and the uncompressed data stream is read;
2) It Is checked whether all the encoded tags are decoded;
3) If all the encoded tags are decoded then the remaining block of uncompressed data is copied and the process stops;
4) If all the encoded tags are not decoded then the uncompressed data (bytes till the next index where the match is found) is copied, the compressed data (match length number of the matched data) is copied in the output and the process stating from step 2 is continued.
Initialize total number of matches to 0 Initialize header to NULL
Repeat for all the bytes in the input data stream
IF (MATCH_PRESENT) THEN
Update Header with match data Increment total number of matches ELSE
Move the input pointer till a match is found or end of stream is reached
Write data in the uncompressed data buffer ENDIF END END
An encoding mechanism is shown in the Figure 5.
Decoding will be done in the following way:
Read Header and Uncompressed data stream
Repeat for all match
Block copy the bytes till the next index where match is found into the
Block Copy match length number of match data into the output
Block Copy the remaining bytes from the uncompressed data stream END
A decoding mechanism is shown in the Figure 6.
Advantage(s) of the Invention
This method has the following advantages:
1. This method is useful when decoding time is very crucial and needs to be very fast. For example displaying icons / bitmaps on a mobile handset needs to be very fast, this type of decoding performance is not available in most of the available compression algorithms.
2. This method improves the decoding time at the cost of decreased
compression. Loss in compression is very less when compared to the improvement got in decoding time
3. In the above method entire decoding process is completed with block memory copies. This will improve the speed of decoding.
4. There is no extra data for every uncompressed byte in the data stream to identify whether it is compressed or not
It will also be obvious to those skilled in the art that other control methods and apparatuses can be derived from the combinations of the various methods and apparatuses of the present invention as taught by the description and the accompanying drawings and these shall also be considered within the scope of the present invention. Further, description of such combinations and variations is therefore omitted above. It should also be noted that the host for storing the applications include but not limited to a computer, printer or a multi function device.
Although the present invention has been fully described in connection with the preferred embodiments thereof with reference to the accompanying drawings, it is to be noted that various changes and modifications are possible and are apparent to those skilled in the art. Such changes and modifications are to be understood as included within the scope of the present invention as defined by the appended claims unless they depart therefrom.
1. A method of data encoding for compression for all compression schemes
having embedded escape tags wherein:
i) encoded data is block copied//moved;
ii) run length encoding and LZ77 is implemented;
iii) the control data is placed in the header or using an offset within the bit
stream; and iv) comparison for every uncompressed byte is avoided.
2. A method of data encoding for compression as claimed in claim 1, wherein the
computation of compression technique is implemented by providing a tag for
both encoded and decoded data.
3. A method of data encoding for compression for all compression schemes
having embedded escape tags substantially as herein described particularly with
reference to the drawings.
Dated this 20th day of December 2003
|Indian Patent Application Number||1043/CHE/2003|
|PG Journal Number||08/2007|
|Date of Filing||23-Dec-2003|
|Name of Patentee||M/S. SAMSUNG INDIA SOFTWARE OPERATIONS PVT. LTD|
|Applicant Address||SAMSUNG ELECTRONICS COMPANY LIMITED, KOREA J.P. TECHNO PARK, 3/1, MILLERS ROAD, BANGALORE 560 052, KARNATAKA, INDIA|
|PCT International Classification Number||H03M7/00|
|PCT International Application Number||N/A|
|PCT International Filing date|