Module 4
- Mean:
- Sum of all observations divided by the number of observations.
- Median:
- It is the value that occupies the middle position when the data are arranged in ascending or descending order.
- Not every individual observation is considered.
- We are concerned only with the value in the middle position.
- Mode:
- It is the most frequently occurring value in the distribution.
- It does not take every individual observation into account.
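As a rough illustration of the three measures above, the following Python sketch uses the standard `statistics` module on a small made-up data set. The values are hypothetical and chosen only for illustration; they do not come from these notes.

```python
import statistics

# Hypothetical observations, chosen only for illustration.
observations = [2, 3, 3, 5, 7]

# Mean: sum of all observations divided by the number of observations.
mean_value = statistics.mean(observations)

# Median: the value in the middle position once the data are sorted.
median_value = statistics.median(observations)

# Mode: the most frequently occurring value.
mode_value = statistics.mode(observations)

print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
```

Only the mean uses every observation in its calculation; the median and mode depend on position and frequency respectively.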
Extreme value / outlying value / outlier:
- A value that is either much larger or much smaller than the rest of the values in the distribution.
- For example: 110, …, 140, 160, 165, 155, 170, …, 195. Here 110 and 195 are outliers.
- If an extreme value is present, the mean will be affected, because the extreme value has to be included in the calculation.
- Whenever an extreme value or outlier is present in the distribution, the mean is the measure of central tendency that is most affected.
- The extreme value pulls the mean towards itself:
- If small: it lowers the mean of the data.
- If large: it raises the mean of the data.
- The Q statistic (Q test) is used to judge whether a value is an outlier, i.e. for identification and rejection of an outlier. Depending on the result, the value can be either included in or excluded from the analysis.
- If no extreme value/outlier is present, we get a symmetrical (normal) distribution curve.
- Depending on the presence of an outlier:
- Present: Asymmetrical or skewed distribution
- Absent: Normal or symmetrical distribution.
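The pull of an outlier on the mean, and the idea behind the Q statistic, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the data are hypothetical, the Q test is assumed to be Dixon's Q test (gap to the nearest neighbour divided by the range), and the critical value `Q_CRITICAL` is an assumed figure that should be checked against a published Q table for the sample size and confidence level in use.

```python
import statistics

def q_statistic(values):
    """Return (suspect value, Q) for the most extreme value in the data.

    Q = gap between the suspect value and its nearest neighbour / range.
    Assumes Dixon's Q test and at least three observations.
    """
    ordered = sorted(values)
    data_range = ordered[-1] - ordered[0]
    if data_range == 0:
        raise ValueError("all values are identical; Q is undefined")
    gap_low = ordered[1] - ordered[0]      # gap at the small end
    gap_high = ordered[-1] - ordered[-2]   # gap at the large end
    if gap_high >= gap_low:
        return ordered[-1], gap_high / data_range
    return ordered[0], gap_low / data_range

# Hypothetical data, chosen only for illustration.
without_outlier = [140, 155, 160, 165, 170]
with_outlier = without_outlier + [300]

mean_before = statistics.mean(without_outlier)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(without_outlier)
median_after = statistics.median(with_outlier)

suspect, q = q_statistic(with_outlier)

# Assumed critical value (commonly tabulated for n = 6 at 95% confidence);
# verify against your own Q table before relying on it.
Q_CRITICAL = 0.625
is_outlier = q > Q_CRITICAL

print("mean:", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
print("suspect:", suspect, "Q:", q, "flagged as outlier:", is_outlier)
```

In this made-up example the single large value raises the mean far more than the median, which is the behaviour the notes describe.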
- Range: It is the difference between the maximum and the minimum values.
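- SD (standard deviation): the square root of the mean of the squared deviations of the observations from the mean, i.e. SD = √( Σ(x − x̄)² / n ); for a sample, n − 1 is commonly used in place of n.
- It is also called root mean square deviation (RMS deviation).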
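- SD ∝ 1/√n for a given sum of squared deviations, because n sits in the denominator under the square root: for the same total squared deviation, the SD increases as the number of observations decreases (from the formula). Adding more observations does not by itself shrink the SD, since the sum of squared deviations also grows.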
- Each of 10 babies born on a day had a birth weight of 2.8 kg. What is the SD of the birth weight?
- The answer is zero. Here the mean = 2.8 kg, range = 0, SD = 0, variance = SD^2 = 0.
- In this situation there is no variation at all, so any measure of variation will be zero.
- Variance: It is SD^2
- Standard error
- Coefficient of variation
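The measures of variation listed above can be sketched in Python as follows. This is an illustrative sketch, not a prescribed method: it assumes the sample formulas (divisor n − 1) provided by the standard `statistics` module, takes standard error as SD/√n and coefficient of variation as SD/mean × 100, and uses a hypothetical `heights` data set alongside the birth-weight example from the notes. The helper name `dispersion` is made up for this example.

```python
import math
import statistics

def dispersion(values):
    """Return common measures of variation for a list of observations."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD, divisor n - 1 (assumption)
    return {
        "range": max(values) - min(values),
        "sd": sd,
        "variance": statistics.variance(values),  # equals SD squared
        "standard_error": sd / math.sqrt(n),
        "cv_percent": sd / mean * 100,
    }

# Example from the notes: 10 babies, each weighing 2.8 kg.
birth_weights = [2.8] * 10
same = dispersion(birth_weights)

# Hypothetical data with some spread, for comparison.
heights = [150, 155, 160, 165, 170]
varied = dispersion(heights)

print("birth weights:", same)
print("heights:", varied)
```

When every observation is identical, every one of these measures comes out as zero, matching the birth-weight answer above.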
Randomness is introduced at two different stages of a study:
- At the time of selecting/recruiting subjects, called random selection. For example, from a class of 50 people, I need to recruit 10 persons for my study.
- First approach: take the people in the front row. This is non-random selection, as not every member is given a chance; the back benchers had no chance of being included.
- Second approach: lottery method. This is random selection, as every member had a chance of being included.
- At the time of dividing the study subjects into two groups, called random group allocation. For example, the 10 people need to be divided into 2 groups.
- Non-random allocation: people sitting on the left get drug A and people sitting on the right get drug B.
- Randomization: a computer-generated list assigns drug A or drug B.
Important note: Random selection may have been done at the first stage, but the term randomization refers to the second stage, i.e. the time of group allocation.
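The two stages can be sketched in Python with the standard `random` module. This is only a sketch under assumptions: class members are represented by the numbers 1 to 50, the two groups are assumed to be of equal size (5 each), and the fixed seed is there purely so the example is reproducible.

```python
import random

# Fixed seed only so that this illustration is reproducible.
rng = random.Random(42)

# A class of 50 people, represented here by roll numbers 1..50 (assumption).
class_members = list(range(1, 51))

# Stage 1: random selection (lottery method).
# Every member has the same chance of being picked.
selected = rng.sample(class_members, 10)

# Stage 2: random group allocation (randomization).
# Shuffle the selected subjects, then split them into two groups.
shuffled = selected[:]
rng.shuffle(shuffled)
group_a = shuffled[:5]   # receives drug A
group_b = shuffled[5:]   # receives drug B

print("selected:", sorted(selected))
print("drug A:", sorted(group_a))
print("drug B:", sorted(group_b))
```

Neither step depends on where a person is sitting, which is what separates these from the front-row and left/right approaches.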
- Single blinding: Study subjects are not aware of the group allocation.
- Double blinding: Study subjects and the investigator are not aware of which is group A and which is group B. This is the most common form of blinding.
- Triple blinding: Study subjects, the investigator and the data analyst are not aware of groups A and B. The entire study is coded, and it is decoded only once the study is over. This is the best form of blinding.