Source Coding and Compression

Contents

Slide 2

Definition 1.1: The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. (Claude Shannon, 1948)

SmOkInG CaN KiLL YoU!

Information Theory

Why did Shannon use the word reproducing and not receiving? What is meant by exactly and approximately?

Slide 3

“THEN, OUR PROBLEM IS THE NOISE!”

Information Theory

How can we achieve perfect communication over an imperfect, noisy communication channel?

Example: an analogue telephone line, over which two modems communicate digital information. modem → line+noise → modem

Definition 1.1: The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. (Claude Shannon, 1948)

Slide 4

The theory provides answers to two fundamental questions (among others):
What is the irreducible complexity below which a signal cannot be compressed? (Shannon Theorem 1)
How can we correct errors or know their locations? (Shannon Theorem 2)
What is the ultimate transmission rate for reliable communication over a noisy channel? (Shannon Theorem 3)
How can we protect messages against hacking? (Shannon Security Theorem)

Shannon's Information Theory

Slide 5

The information theory (IT) framework:
Communication System
Line Coding
Channel Coding (Shannon's 2nd theorem)
Cryptology (Shannon's security theory)
Source Coding (Shannon's 1st theorem)

Information Theory?

Slide 6

The information theory (IT) framework:
Communication System
Line Coding
Channel Coding (Shannon's 2nd theorem)
Cryptology
Source Coding (Shannon's 1st theorem)

Information Theory?

Slide 7

Communication System

Information Theory?

Slide 8

Communication System

Error Free Transmission
Erroneous Transmission
Information Theory?

Slide 9

The information theory (IT) framework:
Communication System
Line Coding
Channel Coding (Shannon's 2nd theorem)
Cryptology
Source Coding (Shannon's 1st theorem)

Information Theory?

Slide 10

Also known as digital baseband modulation (1, 0, 1, 1, 0, …): encoding digital information to make it resistant to certain forms of signal loss during transmission.

Line Coding

Example: Differential Manchester !!! Not returning to zero! Why?
A binary '1' is represented by a “phase inversion”; a binary '0' by a “constant phase”. A sketch of this convention follows.
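To make the slide's convention concrete, here is a minimal Python sketch of a Differential Manchester encoder. The +1/-1 signal levels, the starting phase, and the function name are illustrative assumptions, not part of the slide:

    # Differential Manchester, following the slide's convention:
    # '1' -> phase inversion at the bit boundary, '0' -> constant phase.
    # A mid-bit transition is always present, so the line never stays at one
    # level for a whole bit: that guaranteed transition carries the clock.

    def diff_manchester_encode(bits, level=1):
        """Return two half-bit levels (+1/-1) per input bit."""
        signal = []
        for bit in bits:
            if bit == 1:          # phase inversion at the start of the interval
                level = -level
            signal.append(level)  # first half of the bit period
            level = -level        # mandatory mid-bit (clocking) transition
            signal.append(level)  # second half of the bit period
        return signal

    print(diff_manchester_encode([1, 0, 1, 1, 0]))
    # -> [-1, 1, 1, -1, 1, -1, 1, -1, -1, 1]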

Slide 11

The information theory (IT) framework:
Communication System
Line Coding
Channel Coding (Shannon's 2nd theorem)
Cryptology
Source Coding (Shannon's 1st theorem)

Information Theory?

Slide 12

The Three Channel Properties

Channels can only transport physical signals, e.g., electrical signals. Therefore, digital signals must be converted to appropriate formats (remember line coding or RF).
Even if the signal is adapted to the channel, it does not pass through undisturbed!! The channel introduces errors.
There is always an upper bound on the number of correct bits that you can send over the channel (Shannon Channel Capacity: C).
Slide 13

A.K.A.: Forward error correction (FEC)
For controlling errors in data transmission over unreliable (generally noisy) communication channels.
Tx: encode the message with redundancy.
The first error-correcting code was introduced in 1950 by Richard Hamming: the Hamming (7,4) code.
The simplest code is the repetition code!! Transmit each bit several times, then decode by maximum likelihood (majority vote), as in the example below and the sketch after it.
Example:
0 0 0 → correct 0
0 0 1 → maybe 0
1 1 1 → correct 1
1 0 1 → maybe 1
1 0 0 → maybe 0

Channel Coding
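A minimal sketch of the repetition idea with majority-vote decoding (which is maximum likelihood on a binary symmetric channel); the repetition factor and helper names are illustrative:

    def encode(bits, n=3):
        """Repeat each information bit n times."""
        return [b for bit in bits for b in [bit] * n]

    def decode(received, n=3):
        """Majority vote over each block of n received bits."""
        return [int(sum(received[i:i + n]) > n // 2)
                for i in range(0, len(received), n)]

    codeword = encode([0, 1])   # -> [0, 0, 0, 1, 1, 1]
    noisy = [0, 0, 1, 1, 0, 1]  # one bit flipped in each block
    print(decode(noisy))        # -> [0, 1]: single errors per block are corrected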

Slide 14

Types of Channel Coding

Block codes:
A codeword of length N consists of K information symbols (Ij) and M parity symbols
Code rate R = K/N
Convolutional coding:
Information bits are transformed into n coded bits (not a simple injection; rather a tapped-delay-line model!!)
Can be decoded using the Viterbi algorithm

[Figure: convolutional encoder as a tapped delay line: information bits shift through registers D0, D1, D2, and two modulo-2 adders (+) form the coded bits.]

Iterative decoding:
LDPC (low-density parity-check) codes
Exploit the low count of '1's in the parity-check matrix via a graph (the Tanner graph), then decode iteratively (needs very long codes)
Turbo codes:
Combine two convolutional codes, an inner code and an outer code
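A minimal Python sketch of the tapped-delay-line encoder in the figure above: a rate-1/2 convolutional encoder with three registers and two modulo-2 adders. The specific generator taps are an assumption for illustration (the common (7, 5) octal pair), not taken from the slide:

    def conv_encode(bits, taps=((1, 1, 1), (1, 0, 1))):
        """Rate-1/2 convolutional encoder over a 3-stage delay line."""
        state = [0, 0, 0]               # registers D0, D1, D2
        out = []
        for bit in bits:
            state = [bit] + state[:-1]  # shift the new bit in
            for g in taps:              # one modulo-2 adder per output bit
                out.append(sum(b & t for b, t in zip(state, g)) % 2)
        return out

    print(conv_encode([1, 0, 1, 1]))  # two coded bits per information bit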

Slide 15

The information theory (IT) framework:
Communication System
Line Coding
Channel Coding (Shannon's 2nd theorem)
Cryptology
Source Coding (Shannon's 1st theorem)

Information Theory?

Slide 16

Cryptology or the “Hidden/Secret Study”

Cryptography mechanism

Slide 17

Cryptology or the “Hidden/Secret Study”

Alice encrypts her data using certain security parameters.
Then, the encrypted data is transmitted to Bob, who decrypts it using the synchronized security parameters.
An eavesdropper, Eve, can hack the communication channel and decrypt the encrypted data if she manages to identify Alice's security parameters.

Description of security in a communication scheme

Slide 18

Cryptology in World War II

The German Lorenz cipher SZ42 (SZ for Schlüsselzusatz), one of the first security “machines” in the world … see, no software ;-)
Slide 19

The information theory (IT) framework:
Communication System
Line Coding
Channel Coding (Shannon's 2nd theorem)
Cryptology
Source Coding (Shannon's 1st theorem)

Information Theory?

Slide 20

a.k.a.: The Noiseless Coding Theorem, Bit-rate (Data) Reduction, and Data Compression

The Source Coding Theorem

[Figure: x (original) → Encoder → y (compressed) → Decoder → x' (decompressed)]

Lossless compression: x = x’
(a.k.a. entropy coding, reversible coding)
Applications: text compression
“Do not send money” → “Do now send money” (a single-letter error flips the meaning!)
Lossy compression: x ≠ x’
(a.k.a. irreversible coding)
Applications: image compression
Keywords: distortion vs. perception

Slide 21

The Source Coding Theorem

[Figure: the same image at four compression levels]
Original: 108.5 KB
Low comp. (84% less info): 9.37 KB
Medium comp. (92% less info): 4.82 KB
High comp. (98% less info): 1.14 KB

Slide 22

Compression Measurement Criteria

Compression ratio: |x|/|y|
|x| and |y| are the numbers of bits in x and y, respectively
E.g.:
|x| = 65,536, |y| = 16,384, compression = 4:1
Alternatively, the data has been reduced by (|x|-|y|)/|x| = 75%
Other measures of source coding performance:
Bits per sample
E.g., ASCII: 8 bits/char; RGB: 24/48/72 bits/pixel
Distortion (only for lossy methods)
Depends on the human-perceived and/or mathematical difference between x and x'
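The arithmetic of the example above, as a quick sketch:

    # Compression ratio and percentage reduction for |x| = 65,536 bits
    # and |y| = 16,384 bits.
    x_bits, y_bits = 65_536, 16_384
    ratio = x_bits / y_bits                 # 4.0 -> "4:1"
    reduction = (x_bits - y_bits) / x_bits  # 0.75 -> data reduced by 75%
    print(f"{ratio:.0f}:1, {reduction:.0%} reduction")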
Slide 23

Now we will learn 3 concepts:
Self-information
Entropy
Kraft's inequality

The Source Coding Theorem

Slide 24

Real-World Coding: Morse (1844)

Slide 25

Real-World Coding: Morse (1844)

Generally, high/low frequency symbols get short/long codes!!
- Not 100% consistent!
Example: E vs. T and I vs. M
Slide 26

Assume:
A source with a finite number of symbols S = {s1, s2, ..., sN}
Symbol sn has a probability P(sn) = pn
Let's assume that every symbol sn is encoded with ln bits (bits/symbol)
Before we make any further progress, let's define the term self-information:

The Source Coding Theorem - I

 

Theorem 1.1: (Self-Information)
The self-information of a symbol sn is measured by:
i(sn) = log_b(1/pn) = -log_b(pn)
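A minimal sketch of the definition (the function name is ours):

    import math

    def self_information(p, base=2):
        """i(s) = -log_b p(s); base 2 gives bits."""
        return -math.log(p, base)

    print(self_information(0.5))   # 1.0 bit: a fair coin flip
    print(self_information(0.99))  # ~0.0145 bits: an almost-certain event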

Slide 27

Properties of self-information:
The barking of a dog during a house break-in carries or does not carry much information:
If it barks only in dangerous situations: information.
If it barks all the time: no information.
Independent events have additive (accumulated) self-information, i.e., i(sn, sm) = i(sn) + i(sm)

Self-Information - I
To calculate the information in bits, we take the logarithm with base b = 2.
For b = e (≈ 2.7183), information is measured in “nats”; for b = 10, in “dits” (decimal digits).

Slide 28

Self-Information Examples
Example 1: the outcome of flipping a coin:
If the coin is fair, i.e., P(H) = P(T) = 1/2, then i(H) = i(T) = -log2(1/2) = 1 bit.
If the coin tossing is not fair, because someone is cheating towards “T” ;-), so that P(T) is close to 1 and P(H) is close to 0, then i(T) ≈ 0 bits (no surprise).
If we have a set of independent events {sn} with probabilities pn, then the average self-information associated with the experiment is:

H = Σn pn · i(sn) = -Σn pn · log_b(pn)

This is the entropy (H)
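A sketch computing H for the coin examples above and for the always-'B' source mentioned on the next slide; the biased-coin probabilities are illustrative:

    import math

    def entropy(probs, base=2):
        """H = sum of p * log_b(1/p) over the distribution; bits for base 2."""
        return sum(p * math.log(1.0 / p, base) for p in probs if p > 0)

    print(entropy([0.5, 0.5]))    # 1.0 bit per flip (fair coin)
    print(entropy([0.99, 0.01]))  # ~0.081 bits (biased coin: little surprise)
    print(entropy([1.0]))         # 0.0 (the next symbol is always 'B')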

Slide 29

Entropy as Information Content

Entropy is a probabilistic model such that:
Independent fair coin flips have an entropy of 1 bit per flip. A source that always generates a long string of B's has an entropy of 0, since the next character will always be a 'B'.

Shannon showed that:
Definition 1.2: If the experiment is a source that puts out symbols sn from a set A, then the entropy is a measure of the average number of binary symbols (bits) needed for encoding the source.

Avg. message size in bits

Slide 30

The Source Coding Theorem - II

Theorem 1.2: (Entropy)
The minimum average length of a codeword is the entropy:
l_avg = Σn pn · ln ≥ H(S)

Entropy is the minimum expected average length.
If the average length dropped below H, the code could not be uniquely decoded (see the numeric check below).
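A quick numeric check of the bound. The source probabilities below are an illustrative assumption (chosen dyadic), paired with the prefix code {0, 10, 110, 111} from the next slide; for this pair the average length meets the entropy exactly:

    import math

    probs = [0.5, 0.25, 0.125, 0.125]  # assumed p(s1)..p(s4)
    lengths = [1, 2, 3, 3]             # lengths of {0, 10, 110, 111}

    H = sum(p * math.log2(1 / p) for p in probs)
    l_avg = sum(p * l for p, l in zip(probs, lengths))
    print(H, l_avg)                    # both 1.75 bits: the bound is tight here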

Slide 31

The Source Coding Theorem - III

Code Word and Code Length:
Definition 1.2: A (binary) source code C for a random variable S is a mapping from S to a (finite) binary string (string of bits). Let C(sn) be the codeword corresponding to sn and let l(sn) denote the length of C(sn).
To proceed, let us focus on codes that can be decoded “instantaneously”, e.g., prefix codes.
Definition 1.3: A code is called a prefix code or an instantaneous code if no codeword is a prefix of any other codeword.
Example (checked programmatically below):
Non-prefix: {s1=0, s2=01, s3=011, s4=0111}
Prefix: {s1=0, s2=10, s3=110, s4=111}
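A minimal sketch that tests the two example codes (the helper name is ours):

    def is_prefix_code(codewords):
        """True if no codeword is a prefix of any other codeword."""
        return not any(a != b and b.startswith(a)
                       for a in codewords for b in codewords)

    print(is_prefix_code(["0", "01", "011", "0111"]))  # False: "0" prefixes the rest
    print(is_prefix_code(["0", "10", "110", "111"]))   # True: instantaneous code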
Slide 32

The Source Coding Theorem – IV

Kraft's inequality: the codeword lengths l1, ..., lN of any binary prefix (instantaneous) code satisfy
Σn 2^(-ln) ≤ 1
Can you say why? (A numeric check follows.)
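A quick check of the inequality: the prefix code of the previous slide satisfies it with equality, while the second length set (an illustrative assumption) violates it, so no prefix code can have those lengths:

    def kraft_sum(lengths):
        """Sum of 2**(-l) over the codeword lengths."""
        return sum(2.0 ** -l for l in lengths)

    print(kraft_sum([1, 2, 3, 3]))  # 1.0 <= 1: lengths of {0, 10, 110, 111}
    print(kraft_sum([1, 1, 2, 2]))  # 1.5 >  1: impossible for a prefix code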

Slide 33

The Source Coding Theorem - V

Conversely, given a set of codeword lengths satisfying the above inequality, one can construct an instantaneous code with these word lengths!! (Note the can: the inequality guarantees that such a code exists, not that every code with these lengths is instantaneous.)
Slide 34

The Source Coding Theorem – VI

Do It Yourself!!

Slide 35

Notation for Sequences & Codes

Assume a sequence of symbols X = {X1, X2, ..., Xn} from a finite source alphabet AX.
We almost always use A = {0, 1} (e.g., computer files, digital communication), but the theory can be generalized to any finite set.
Encoder: outputs a new sequence Y = {Y1, Y2, ..., Yn} (using a possibly different code alphabet AY).
A symbol code, C, is a mapping of AX → AY; we use c(x) to denote the codeword to which C maps x.

[Figure: x (original) → Encoder → y (compressed) → Decoder → x' (decompressed)]

Slide 36

Lossless Data Compression

Let's focus on the lossless data compression problem for now, and not worry about noisy channel coding. In practice these two problems (noise and compression) are handled separately:
An efficient code for the source data removes source redundancy.
Then (if necessary) we design a channel code to help us transmit the source code over the channel (adding redundancy back) almost error free.
Assumptions (for now):
The channel is perfectly noiseless, i.e., the receiver sees exactly the encoder's output.
We always require the output of the decoder to exactly match the original sequence X.
X is generated according to a fixed probabilistic model p(X), which is known!!
We will measure the quality of our compression scheme by examining the average length of the encoded string Y, averaged over p(X).

[Figure: x (original) → Encoder → y (compressed) → Decoder → x' (decompressed)]