Normalisation, L1, L2 Norms

neovijayk
Jul 6, 2020

In this article I gather answers to some important questions about normalization. I have collected the answers from different sources and websites. These questions and answers should help you understand normalization in machine learning.

Before moving forward we will take a look at basic terms and definitions:

Now we will take a look at normalization.

Normalisation:

When do we need to perform normalisation?

In machine learning, not every data-set requires normalisation. It is required only when features have different ranges.
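A quick way to check whether features have different ranges is to compare the spread of each column. The sketch below uses a made-up array (an age column and an income column); the values and the column meanings are illustrative assumptions, not data from this article.

```python
import numpy as np

# Hypothetical data: column 0 = age in years, column 1 = yearly income
X = np.array([[25, 30000.0],
              [40, 85000.0],
              [58, 52000.0]])

col_min = X.min(axis=0)          # smallest value of each feature
col_max = X.max(axis=0)          # largest value of each feature
col_range = col_max - col_min    # spread of each feature

print("min:  ", col_min)
print("max:  ", col_max)
print("range:", col_range)
```

Here the second feature spans a range more than a thousand times wider than the first, which is the kind of situation where normalisation is worth considering.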

Normalization function in Python:

Using the sklearn package in Python we can normalize our features with either the L1 or the L2 norm.
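As a minimal sketch of the call itself, the snippet below applies sklearn.preprocessing.normalize() with both norms to a small made-up array. The array X is an assumption chosen so the results are easy to verify by hand; normalize() works row by row unless axis=0 is passed.

```python
import numpy as np
from sklearn.preprocessing import normalize

# Hypothetical input, chosen so the norms are easy to compute by hand
X = np.array([[3.0, 4.0],
              [1.0, 1.0]])

# norm='l1': each row is divided by the sum of its absolute values
X_l1 = normalize(X, norm='l1')

# norm='l2' (the default): each row is divided by its Euclidean length
X_l2 = normalize(X, norm='l2')

print(X_l1)
print(X_l2)
```

For the first row, the L1 result is [3/7, 4/7] and the L2 result is [0.6, 0.8].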

L1 Vs L2? Which one to use?

L1 norm

Picture 1: L1 norm

As a loss function it is also known as least absolute deviations (LAD) or least absolute errors (LAE)
In that setting it minimizes the sum of the absolute differences (S) between the target values (Yi) and the estimated values (f(xi)), as shown in picture 1
As a normalization, it scales the values so that the sum of absolute values = 1
Example: if this norm is applied along a row, then the sum of absolute values for that row = 1.
It is less sensitive to outliers than the L2 norm, because errors are not squared
Sparsity:
Sparsity means that only very few entries in a matrix (or vector) are non-zero.
The L1 norm tends to produce many coefficients that are zero or very small, with only a few large coefficients.
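To make the "sum of absolute values = 1" property concrete, here is a small sketch that L1-normalises one row by hand with NumPy. The row values are made up for illustration.

```python
import numpy as np

# Hypothetical row with a negative entry, to show why absolute values matter
row = np.array([1.0, -2.0, 5.0])

l1_norm = np.abs(row).sum()      # |1| + |-2| + |5| = 8
row_l1 = row / l1_norm           # L1-normalised row

print(row_l1)
print(np.abs(row_l1).sum())      # sum of absolute values after scaling
```

Note that it is the absolute values that add up to 1; the plain sum of the scaled row is 0.5 here because of the negative entry.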

L2 norm

Picture 2: L2 norm

As a loss function it is also known as least squares
As a normalization, it scales the values so that the sum of squares = 1
Example: if this norm is applied along a row, then the sum of squares for that row = 1.
It takes outliers into consideration during training:
Because errors are squared, it is sensitive to outliers in the data.
Computational efficiency:
Least absolute deviations (L1) does not have an analytical solution, but least squares (L2) does.
This allows L2-norm solutions to be computed efficiently.
However, L1-norm solutions have the sparsity property, which allows them to be used with sparse algorithms and makes the calculation more computationally efficient.
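Returning to the "sum of squares = 1" property of the L2 norm listed above, the following sketch L2-normalises one row by hand with NumPy. The row [3, 4] is a made-up example chosen because its Euclidean length is exactly 5.

```python
import numpy as np

# Hypothetical row whose Euclidean length is easy to compute
row = np.array([3.0, 4.0])

l2_norm = np.sqrt((row ** 2).sum())   # sqrt(9 + 16) = 5
row_l2 = row / l2_norm                # approximately [0.6, 0.8]

print(row_l2)
print((row_l2 ** 2).sum())            # sum of squares, approximately 1
print(row_l2.sum())                   # plain sum, approximately 1.4
```

The plain sum of an L2-normalised row is generally not 1; only the sum of squares is.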

(Source 1, Source 2)
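One simple way to see how the two error measures react to an outlier is to fit a single constant to some data. The constant that minimises the sum of squared errors (L2) is the mean, and the constant that minimises the sum of absolute errors (L1) is the median. The numbers below are made up for illustration.

```python
import numpy as np

# Hypothetical measurements clustered around 10
clean = np.array([10.0, 11.0, 9.0, 10.0, 10.0])
with_outlier = np.append(clean, 100.0)   # add one extreme value

# Mean   = best constant under squared error (L2)
# Median = best constant under absolute error (L1)
print("mean  :", clean.mean(), "->", with_outlier.mean())
print("median:", np.median(clean), "->", np.median(with_outlier))
```

A single extreme value moves the mean from 10 to 25, while the median stays at 10.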

Example using L1 normalization along column:

First we will look at the output of normalize() using the L1 norm along axis = 0

import numpy as np
from sklearn import preprocessing

# Input array
npSample = np.array([[1, 11],
                     [0, 12]])

# axis = 0 normalizes each column
result = preprocessing.normalize(npSample, norm='l1', axis=0)
print(result)  # printing the result

# With the L1 norm the absolute values in each column add up to 1
# (up to floating-point rounding)
print("After normalization Addition of 1st Column:", np.abs(result[:, 0]).sum())
print("After normalization Addition of 2nd Column:", np.abs(result[:, 1]).sum())

Example using L2 normalization along column:

First we will look at the output of normalize() using the L2 norm along axis = 0

import numpy as np
from sklearn import preprocessing

# Input array
npSample = np.array([[1, 11],
                     [0, 12]])

# axis = 0 normalizes each column
result = preprocessing.normalize(npSample, norm='l2', axis=0)
print(result)

# With the L2 norm the squares of the values in each column add up to 1
# (the plain sum of the 2nd column is about 1.41, not 1)
print("After normalization sum of squares of 1st Column:", (result[:, 0] ** 2).sum())
print("After normalization sum of squares of 2nd Column:", (result[:, 1] ** 2).sum())

Normalization along columns & rows:

Which one should we use: normalization with axis = 0 (along the columns) or axis = 1 (along the rows)?

By default, normalize() works along the rows (axis = 1).
But it depends on the input features you are using. I use axis = 0 (along the columns) for logistic regression when the input consists of different features with different ranges of values.

First we will look at the output of normalize() using axis = 0 along the column

import numpy as np
from sklearn import preprocessing

# Input array
npSample = np.array([[1, 11],
                     [0, 12]])

# axis = 0 normalizes each column
result = preprocessing.normalize(npSample, norm='l2', axis=0)
print(result)  # printing the result

# answers should be 1 (up to floating-point rounding)
print("After normalization sum of squares of 1st Column:", (result[:, 0] ** 2).sum())
print("After normalization sum of squares of 2nd Column:", (result[:, 1] ** 2).sum())

Note that after normalization the sum of squares along each column is 1

Now we will look at the output of normalize() using axis = 1 along the row:

# axis = 1 normalizes each row (npSample is the input array defined above)
result = preprocessing.normalize(npSample, norm='l2', axis=1)
print(result)

# With the L2 norm the squares of the values in each row add up to 1
print("After normalization sum of squares of 1st row:", (result[0, :] ** 2).sum())
print("After normalization sum of squares of 2nd row:", (result[1, :] ** 2).sum())

Note that after normalization the sum of squares along each row is 1
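As a cross-check, the L2 norm of each column or row can be computed directly with np.linalg.norm. The sketch below reuses the article's input array; the names by_column and by_row are introduced here only for illustration.

```python
import numpy as np
from sklearn import preprocessing

npSample = np.array([[1, 11],
                     [0, 12]])

by_column = preprocessing.normalize(npSample, norm='l2', axis=0)
by_row = preprocessing.normalize(npSample, norm='l2', axis=1)

# L2 norm of every column / row after normalization
col_norms = np.linalg.norm(by_column, axis=0)   # one value per column
row_norms = np.linalg.norm(by_row, axis=1)      # one value per row

print(col_norms)
print(row_norms)
```

With axis = 0 every column has unit length, and with axis = 1 every row has unit length, up to floating-point rounding.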

That is it. If you have any questions, feel free to ask in the comment section. For the code explained above, please refer to this GitHub repository.

If you liked this article, please click the like button and subscribe to the blog. 🙂
