Normalization Assignment Help 


In statistics and its applications, normalization can have several meanings. The word “normalization” is used informally for any process that produces normalized data. Data are normalized in many situations, some simple and some complicated. In the simplest cases, normalization removes the units of measurement and adjusts values measured on different scales to a notionally common scale, so that variables or values can be compared directly. This is often done prior to averaging.


For example, normalization can be used to compare the weather in New York, United States, with that in London, United Kingdom, when London is at 10 degrees Celsius and New York is at 68 degrees Fahrenheit. Because the United States uses Fahrenheit and the United Kingdom uses Celsius to measure temperature, the two readings cannot be compared directly. Normalization reduces both measurements to a neutral or standard scale, so that we can tell which place is colder.
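As a minimal sketch of this comparison, the snippet below converts the New York reading to Celsius with the standard Fahrenheit-to-Celsius formula and then compares the two cities. The function and variable names are illustrative assumptions, not part of the original example.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9


london_c = 10.0                           # London, already in Celsius
new_york_c = fahrenheit_to_celsius(68.0)  # New York, converted to Celsius

# With both readings on the same scale, a direct comparison is meaningful.
colder_city = "London" if london_c < new_york_c else "New York"

print(f"New York: {new_york_c:.1f} C, London: {london_c:.1f} C")
print(f"Colder city: {colder_city}")
```

On a common scale, New York's 68 degrees Fahrenheit is 20 degrees Celsius, so London is the colder of the two.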

There are various ways to normalize data, depending on the type of measurement scale. The main scale types are described below:


Nominal Scale

1) As in the previous example, if two nominal variables were measured on different scales, the measurements must be reduced to a common scale before they can be compared. This shows how the categories of one variable equate with those of the other, and it is what normalization means here. There is no standard way to do this, but two approaches are common:

  1. To compare observations of one variable with those of the other, build a contingency table. If the observations consistently fall into matching categories, we can set up a one-to-one mapping between the two scales, e.g., a “4” in variable 1 corresponds to an “86” in variable 2.
  2. Another way is to use an established technique called “correspondence analysis”, which finds scores that maximize the correspondence between the two variables.
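The contingency-table approach can be sketched as follows. This is only an illustration under stated assumptions: the observation codes are made up (apart from the “4” and “86” pairing mentioned above), and the helper names `contingency_table` and `one_to_one_mapping` are hypothetical.

```python
from collections import Counter


def contingency_table(var1, var2):
    """Count how often each (code in var1, code in var2) pair occurs."""
    return Counter(zip(var1, var2))


def one_to_one_mapping(table):
    """Return a {var1 code: var2 code} mapping, or None if it is not one-to-one."""
    forward, backward = {}, {}
    for a, b in table:
        # setdefault returns the code already stored if the key was seen before,
        # so a mismatch means one code is paired with two different partners.
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return None
    return forward


# Hypothetical codes for the same five observations on two nominal scales.
variable_1 = [4, 4, 7, 7, 9]
variable_2 = [86, 86, 12, 12, 30]

table = contingency_table(variable_1, variable_2)
mapping = one_to_one_mapping(table)
print(dict(table))
print(mapping)
```

If any category of one variable co-occurs with more than one category of the other, the function returns `None`, signalling that a simple one-to-one recoding is not available.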


Ordinal Scale

| Observation | V1 | V2 | V3 |
|---|---|---|---|
| i | 44 | 0 | 180 |
| ii | 44 | 0 | 180 |
| iii | 44 | 0 | 180 |
| iv | 45 | 2 | 215 |
| v | 46 | 134 | 223 |


1) One way to standardize this type of scale is to transform the values to rank orders, which yields:

| Observation | V1* | V2* | V3* |
|---|---|---|---|
| i | 1 | 1 | 1 |
| ii | 1 | 1 | 1 |
| iii | 1 | 1 | 1 |
| iv | 2 | 2 | 2 |
| v | 3 | 3 | 3 |


2) By eliminating the units (in this case, the numerical differences between values), the normalized data help us judge whether the variables are even measuring the same thing, because what is compared is the underlying (latent) values.

3) The * denotes the normalized version of the variable.
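The rank transformation can be sketched as below, using the V2 and V3 columns from the first table. The sketch assumes ties share a rank and that ranks are consecutive ("dense" ranking), which is what the rank table implies; the function name `dense_rank` is an illustrative choice.

```python
def dense_rank(values):
    """Replace each value by its rank order; tied values share the same rank."""
    # Map each distinct value to its position among the sorted distinct values.
    rank_of = {v: r for r, v in enumerate(sorted(set(values)), start=1)}
    return [rank_of[v] for v in values]


v2 = [0, 0, 0, 2, 134]
v3 = [180, 180, 180, 215, 223]

v2_star = dense_rank(v2)
v3_star = dense_rank(v3)
print(v2_star)
print(v3_star)
```

Although the raw numbers in the two columns are very different, their rank orders are identical, which is the sense in which they appear to measure the same thing.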


Interval Scale

1) This type of scale corresponds to a linear transformation, Y = mX + b: multiplying every value of a variable by a constant and/or adding a constant to it yields values that are equally valid. This is because a linear transformation does not change the ratios between the differences of the numbers, as the following example shows:

| Observation | V1 | V2 | V3 | V4 |
|---|---|---|---|---|
| i | 44 | 64 | 440 | 460 |
| ii | 44 | 64 | 440 | 460 |
| iii | 44 | 64 | 440 | 460 |
| iv | 46 | 66 | 460 | 480 |
| v | 48 | 68 | 480 | 500 |


2) “Normalization” is often used interchangeably with “standardization”, but the two can mean different things. Normalization usually rescales the values to lie between 0 and 1, whereas standardization converts the data to have a mean of zero and a standard deviation of 1.


3) Here the linear transformation converts each variable to values with a mean of zero and a standard deviation of 1. Reducing the variables to z-scores yields:

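| Observation | V1 | V2 | V3 | V4 |
|---|---|---|---|---|
| i | -0.75 | -0.75 | -0.75 | -0.75 |
| ii | -0.75 | -0.75 | -0.75 | -0.75 |
| iii | -0.75 | -0.75 | -0.75 | -0.75 |
| iv | 0.50 | 0.50 | 0.50 | 0.50 |
| v | 1.75 | 1.75 | 1.75 | 1.75 |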


4) Note that all the normalized values are simply deviations from the mean, expressed in units of standard deviation: row v is well above the mean and row iv is only slightly above it.

5) Z-scores are very common and very important in statistics. They allow us to compare different sets of data using standard tables called z-tables.
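The difference between rescaling to the range 0 to 1 and standardizing to z-scores can be sketched as follows, using the V1 and V3 columns of the interval-scale table. The function names are illustrative assumptions, and min-max rescaling is assumed here as the usual way of mapping values onto 0 to 1. The z-scores in the table are reproduced when the population standard deviation (dividing by n) is used, so the sketch uses `statistics.pstdev`.

```python
import statistics


def min_max(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("a constant variable cannot be rescaled to [0, 1]")
    return [(v - lo) / (hi - lo) for v in values]


def z_scores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population SD, i.e. divides by n
    if sd == 0:
        raise ValueError("a constant variable has no z-scores")
    return [(v - mean) / sd for v in values]


v1 = [44, 44, 44, 46, 48]
v3 = [440, 440, 440, 460, 480]

v1_minmax = min_max(v1)
v1_z = [round(z, 2) for z in z_scores(v1)]
v3_z = [round(z, 2) for z in z_scores(v3)]
print(v1_minmax)
print(v1_z)
print(v3_z)
```

Because V3 is a linear transformation of V1, both columns give the same z-scores even though their raw values differ by a factor of ten.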


Ratio Scale

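1) Ratio scales correspond to a proportionality transformation, Y = mX. Unlike interval scales, no constant may be added: only multiplying every value by a positive constant yields equally valid values, and this leaves the ratios between the values themselves unchanged.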

| Observation | V1 | V2 | V3 | V4 |
|---|---|---|---|---|
| i | 44 | 440 | 22 | 24 |
| ii | 44 | 440 | 22 | 24 |
| iii | 44 | 440 | 22 | 24 |
| iv | 46 | 460 | 23 | 26 |
| v | 48 | 480 | 24 | 28 |


2) To normalize a ratio scale, perform a similarity transformation: divide each value by the square root of the sum of the squares of all the original values of that variable. This rescales the data to values between 0 and 1, and the result is:

| Observation | V1 | V2 | V3 | V4 |
|---|---|---|---|---|
| i | 0.44 | 0.44 | 0.44 | 0.43 |
| ii | 0.44 | 0.44 | 0.44 | 0.43 |
| iii | 0.44 | 0.44 | 0.44 | 0.43 |
| iv | 0.45 | 0.45 | 0.45 | 0.46 |
| v | 0.47 | 0.47 | 0.47 | 0.50 |


3) The important point here is that the first three columns agree, but the last column does not, which indicates that V4 measures something else.
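The ratio-scale normalization described above can be sketched as below, using the V1 and V4 columns of the ratio-scale table. The function name `unit_norm` is an illustrative assumption; the calculation itself is the division by the square root of the sum of squares.

```python
import math


def unit_norm(values):
    """Divide each value by the square root of the sum of squared values."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm == 0:
        raise ValueError("an all-zero variable cannot be normalized this way")
    return [v / norm for v in values]


v1 = [44, 44, 44, 46, 48]
v4 = [24, 24, 24, 26, 28]

v1_norm = [round(x, 2) for x in unit_norm(v1)]
v4_norm = [round(x, 2) for x in unit_norm(v4)]
print(v1_norm)
print(v4_norm)
```

V4 is not a constant multiple of V1, so its normalized values differ from those of the other columns.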


Difference Scale (aka Additive Scale)

1) Additive scales correspond to a “translation” transformation, Y = X + b. This means that adding a constant to every value of a variable yields values that are just as valid as the original ones. The differences between the values are not affected.

| Observation | V1 | V2 | V3 |
|---|---|---|---|
| i | 22 | 12 | 11 |
| ii | 22 | 12 | 11 |
| iii | 22 | 12 | 11 |
| iv | 23 | 13 | 11.5 |
| v | 25 | 15 | 12 |


2) Here we perform a translation that gives each variable a mean of zero, simply by subtracting the mean of the original values from each value, which yields:

| Observation | V1 | V2 | V3 |
|---|---|---|---|
| i | -0.8 | -0.8 | -0.3 |
| ii | -0.8 | -0.8 | -0.3 |
| iii | -0.8 | -0.8 | -0.3 |
| iv | 0.2 | 0.2 | 0.2 |
| v | 2.2 | 2.2 | 0.7 |


3) The important point here is that the first two columns agree, but the last column does not, which indicates that V3 measures something else.
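Mean-centering, the translation used above, can be sketched as follows with the V1 and V3 columns of the difference-scale table. The function name `center` is an illustrative assumption.

```python
import statistics


def center(values):
    """Subtract the mean from every value so the result has a mean of zero."""
    mean = statistics.mean(values)
    return [v - mean for v in values]


v1 = [22, 22, 22, 23, 25]
v3 = [11, 11, 11, 11.5, 12]

# Round only for display; floating-point subtraction leaves tiny residues.
v1_centered = [round(x, 1) for x in center(v1)]
v3_centered = [round(x, 1) for x in center(v3)]
print(v1_centered)
print(v3_centered)
```

V3 is not V1 plus a constant, so its centered values differ from those of the first two columns.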


Absolute Scale

  1. This type of scale corresponds to the identity transformation, Y = X. This means that the values are unique and cannot be transformed without changing their meaning.
  2. As a result, absolute-scaled variables cannot be normalized.


In conclusion, the main idea behind normalization is to remove the effects of measurement units and scale so that values can be compared on a common footing and, in more complicated cases, to bring entire probability distributions into alignment by adjusting the values. It is one of the most important topics in statistics, and several important data measurement methods are based on it.