Module 2
  1. Nominal variable
    • Dichotomous variable
    • Polychotomous variable
  2. Ordinal variable
  3. Metric variable
    • Interval variable
    • Ratio variable

Nominal variables:
Nominal variables are variables that can be grouped into categories. They are always qualitative and discrete (countable), and are therefore also called categorical. Each category can be given a name, but there is no inherent rank or order among the categories. Nominal variables are classified into two types:

  1. Dichotomous nominal variable:
    Nominal variables that can be divided into only two categories. For example, rural/urban, diseased/non-diseased, cured/not cured, low birth weight present/low birth weight absent, etc. The chi-square test is applicable.
  2. Polychotomous nominal variable:
    Nominal variables that can be grouped into three or more categories. For example, religion, race, ethnicity, blood group, types of blood cells, etc.

Ordinal variables:
Ordinal variables are variables that can be grouped into categories with an inherent rank or order among the categories. They can be either qualitative or quantitative. For example, division in class (I, II, III divisions), severity of disease (mild, moderate, severe), staging of disease, socio-economic status scale (upper, middle, lower), birth order, and so on. Likert scale responses are ordinal: a response is graded on an agree/disagree continuum, and level of satisfaction is likewise graded on a continuum.

Metric variables:
These are always continuous and always quantitative. Metric variables are of two types:

  1. Interval variable:
    They do not have an absolute zero, i.e. the quantity is still present at zero. For example, temperature in Celsius (at zero degrees there is still some temperature), temperature in Fahrenheit, I.Q. (Intelligence Quotient) level, etc.
  2. Ratio variable:
    It has an absolute zero, meaning that the quantity is absent at a numeric value of zero. For example, height, weight, blood pressure, temperature in K, hemoglobin, blood sugar, serum cholesterol, etc.
  1. Qualitative data
  2. Quantitative data

Qualitative data
Qualitative data reflect characteristics or attributes, which can be counted but cannot be measured. For example, religion, gender, blood group, types of blood cells. It also includes number of births, deaths, cases, and pregnancies.

Qualitative data are discrete in nature. They can take only certain values and nothing in between, which means they cannot be expressed in decimals. Qualitative data can be represented diagrammatically by (BSP) bar graph, spot map, or pie chart, or by pictogram.

Qualitative data can be summarized in the form of a rate, ratio, or proportion. BSP diagrams are used to denote the frequency distribution of discrete or qualitative data; the pie chart in particular is used to denote percentage or proportion. A test of significance is applied only to proportions. The most commonly used test is the chi-square test.
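As a minimal sketch of the chi-square test mentioned above, the snippet below compares a proportion between two groups using the shortcut formula for a 2x2 table, without continuity correction. The counts and the function name `chi_square_2x2` are illustrative assumptions, not data from these notes. The value 3.841 is the standard chi-square critical value for 1 degree of freedom at the 5% level.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic (1 degree of freedom, no continuity correction)
    for a 2x2 table whose rows are (a, b) and (c, d)."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Hypothetical counts: rows = rural/urban, columns = diseased/non-diseased.
chi2 = chi_square_2x2(30, 70, 15, 85)

# Critical value of chi-square for df = 1 at alpha = 0.05.
CRITICAL_VALUE_5_PERCENT = 3.841
significant = chi2 > CRITICAL_VALUE_5_PERCENT

print(f"chi-square = {chi2:.2f}, significant at 5%: {significant}")
```

A statistic above the critical value suggests that the two proportions differ by more than chance alone would explain, under the usual assumptions of the test.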

Quantitative data:
Quantitative data reflect characteristics or attributes that can be measured. For example, height, weight, temperature, hemoglobin, blood sugar, serum cholesterol, blood pressure, etc. Data that are measurable are continuous. They can take any value, i.e. they can be expressed in decimals.

Diagrammatic representation can be done by histogram, frequency polygon, ogive, line diagram, or scatter plot. An ogive is a cumulative frequency curve.
A histogram is used to denote the frequency distribution of continuous or quantitative data. A line diagram is used to denote the time trend of continuous data; the secular trend or time trend of a disease is denoted by a line diagram. A scatter plot is used to show the relationship between two quantitative or continuous variables.

Quantitative data can be summarized by the mean (a test of significance can be applied), the SD (a test of significance can be applied), and the correlation coefficient (a test of significance cannot be applied). The t test or Z test can be applied to the mean and SD but not to the correlation coefficient.
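The three summary measures named above can be sketched in Python as follows. The height and weight values are made-up illustrative numbers, `pearson_r` is a hypothetical helper name, and the SD shown is the sample SD (n - 1 in the denominator), which is an assumption about which SD is intended.

```python
import math
import statistics

# Hypothetical measurements for five people (illustrative only).
height_cm = [150.0, 160.0, 165.0, 170.0, 180.0]
weight_kg = [50.0, 58.0, 63.0, 68.0, 80.0]

mean_height = statistics.mean(height_cm)
sd_height = statistics.stdev(height_cm)  # sample SD, n - 1 in the denominator


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cross = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    return cross / math.sqrt(ss_x * ss_y)


r = pearson_r(height_cm, weight_kg)
print(f"mean = {mean_height:.1f} cm, SD = {sd_height:.2f} cm, r = {r:.3f}")
```

The mean and SD describe one variable at a time, while the correlation coefficient describes the relationship between two variables, which is what a scatter plot displays.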

  1. Experimental/Interventional
    • Randomized control trial
    • Non-randomized control trial
  2. Observational
    • Analytical
    • Descriptive

Study design is an important component of research. It basically depends on whether or not the investigator has assigned an intervention. Intervention, the decisive question, implies taking something away or giving something.

Intervention assigned by the investigator:

  1. Yes: EXPERIMENTAL/INTERVENTIONAL STUDY, classified by randomization at the time of allocation
    • Done: randomized controlled trial (RCT)
    • Not done: non-randomized control trial (NRCT)
    • Trials by study unit:
      • Patient: clinical trial, e.g. drug trial
      • Healthy individual: field trial, e.g. vaccine trial
      • Community trial
  2. No: OBSERVATIONAL STUDY, classified by comparison group
    • Present: analytical study
      • We cannot prove the hypothesis.
      • Case control study and cohort study are the two purely analytical studies.
      • A comparison group is present because we cannot prove a hypothesis with one group alone.
    • Absent: descriptive study
      • We cannot prove a hypothesis but can only make assumptions.
      • Includes case study, case reporting, case series, and surveillance.
      • These studies have one group only.

Cross-sectional study and ecological study can be either analytical or descriptive.

Descriptive study:
The following are considered descriptive studies.

  1. Case study/reporting: It is the presentation of clinical findings of a single patient in the form of a report. Here, single patient implies a case of a disease.
  2. Case series: It is the presentation of similar clinical findings of a group of patients in the form of a report. We have a series of cases with similar findings.
  3. Surveillance: It is defined as continuous scrutiny of all aspects of a disease pertinent to its effective control.

Purely analytical study:

  1. Case control study
    Here, we select a group of people with the disease as cases and a closely matched group without the disease as the control or comparison group. Matching is done on everything except disease status to ensure a comparable group. We go backward and take the history of exposure, so a case control study is always retrospective.

    The advantages of a case control study are that it is easy to conduct, less time consuming, and less expensive. Its biggest strength is that there is no problem of loss to follow-up, attrition, or dropout. It is the only design suitable for rare diseases. Multiple types of exposure resulting in a single disease outcome can be studied.

    The disadvantages of a case control study include the following. It is full of bias, mainly recall bias and selection bias. Temporality of association cannot be established because of the retrospective design. The natural history of disease cannot be studied: for example, we cannot learn what course the disease would follow if we did not intervene or treat. Incidence cannot be calculated, and hence relative risk and attributable risk cannot be computed. We can only assess the odds ratio from a case control study, which is an inferior measure of strength of association compared with relative risk.
    Odds ratio (OR), also called cross product ratio:

                         Case (D+)   Control (D-)
    Exposure (E+)            a            b
    No exposure (E-)         c            d

    OR = ad/bc
    a: cases exposed to factor
    b: controls exposed to factor
    c: cases not exposed to factor
    d: controls not exposed to factor

  2. Cohort study:
    The design is prospective and forward looking, whereby a cause-to-effect analysis is made. It is usually a longitudinal or follow-up study. Note, however, that all cohort studies are longitudinal studies but not all longitudinal studies are cohort studies. A longitudinal study is a cohort study only when we follow both an exposure-present (E+) group and an exposure-absent (E-) group to look for the presence of the outcome or disease (D). If we follow only one group, it is not a cohort study but a longitudinal follow-up study, as described under cross-sectional study.

Advantages of cohort study:

  • There is no recall bias
  • Temporality of association can be established: the cause precedes the effect.
  • Natural history of disease can be established.
  • Single type of exposure resulting in multiple disease outcomes can be studied.
  • We will be able to calculate incidence and hence relative risk and attributable risk (AR) can be computed. Relative risk (RR) is better than odds ratio.

Note: Cohort study is necessary to calculate AR and RR as we require incidence in exposed and incidence in unexposed groups. But for simple calculation of incidence, cohort study is not required.
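To illustrate the note above, here is a minimal Python sketch that computes relative risk and attributable risk from the incidence in the exposed and unexposed groups of a cohort. The cohort counts are hypothetical, and the function names are assumptions chosen for this example.

```python
def incidence(new_cases, people_followed):
    """Absolute risk: new cases divided by the number of people followed."""
    return new_cases / people_followed


def relative_risk(incidence_exposed, incidence_unexposed):
    """RR = incidence in E+ / incidence in E-."""
    return incidence_exposed / incidence_unexposed


def attributable_risk_percent(incidence_exposed, incidence_unexposed):
    """AR = (incidence in E+ minus incidence in E-) x 100 / incidence in E+."""
    return (incidence_exposed - incidence_unexposed) * 100 / incidence_exposed


# Hypothetical cohort: 1000 exposed and 1000 unexposed people followed up.
incidence_exposed = incidence(50, 1000)
incidence_unexposed = incidence(10, 1000)

rr = relative_risk(incidence_exposed, incidence_unexposed)
ar = attributable_risk_percent(incidence_exposed, incidence_unexposed)

print(f"RR = {rr:.1f}, AR = {ar:.0f}%")
```

Both measures need the two incidences side by side, which is why they require a design that follows exposed and unexposed groups.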

Disadvantages of cohort study:

  • More time consuming and more expensive.
  • There will always be the problem of loss to follow-up.
  • Hawthorne effect: this is a problem with prospective or concurrent cohort studies. It means people may alter their behavior during the course of follow-up. Smokers may become non-smokers and vice versa. So, the RR at the end of the study may not be a true value.

Some exceptions:
Cohort study is not always prospective. It can have variants as well.

  1. Nested case control study: a type of cohort study in which the design is prospective.
  2. Retrospective or historical cohort study: the design is retrospective.
  3. Mixed cohort study: it starts as a retrospective cohort study and continues into the future as a prospective study.

Matching:
Matching is done to make groups comparable on baseline characteristics, which decreases confounding in the study.
The groups are matched on everything except exposure status, making sure that neither group has the disease or outcome at the start of the study. The disease or outcome then develops at different time points.

Either analytical or descriptive study

  1. Cross sectional study:
    It is used to calculate prevalence and is also known as a prevalence study. Here, we examine a group of people at a cross section of time, i.e. at one point of time only. If we examine the same group of people again and again at different time points, it is called a longitudinal or follow-up study, which gives the new cases or incidence of disease in the population.
  2. Ecological study:
    This is a weak study design because the population is taken as the unit of study. An ecological study gives only group characteristics, not individual characteristics. We pick one group characteristic and correlate it with the occurrence of a disease or phenomenon in that population. The result of an ecological study is valid only at the population level, not at the individual level. For example, if we conclude that every individual with high fat intake has the disease, that is a bias called ecological fallacy. Hence, ecological fallacy is ascribing to persons group characteristics that they may or may not possess as individuals.
  1. Absolute risk
    • It is incidence of disease
    • It shows, out of the people followed, how many developed the disease.
  2. Relative risk/risk ratio:
    • Incidence in exposed (E+)/incidence in non-exposed (E-)
    • Incidence in E+ is one absolute risk and incidence in E- is another absolute risk.
    • RR answers how many times higher the risk of disease is in the exposed group compared with the non-exposed group.
    • Cohort study is required.
    • If RR between smoking and lung cancer is 20, we conclude that smokers have 20 times the risk of developing lung cancer compared with non-smokers. It does not mean smoking causes cancer of the lung.
  3. Odds ratio or cross product ratio:
    • From case control or retrospective study.
    • OR=ad/bc
    • In case of rare disease, OR is approximately equal to RR so case control study is better for rare disease.
    •  Interpretation
      • OR or RR=1 means no association between exposure and outcome. RR= incidence in E+/Incidence in E-, so RR=1 when both the values are same in numerator and denominator.
      • OR or RR > 1: there is positive association between exposure and outcome. This implies x many times higher in exposed than unexposed.
      • OR or RR < 1: there is a negative or protective association. RR < 1 when incidence in E+ is less than incidence in E-.

        Which has more risk among the following groups?

        Groups   OR    95% CI
        A        2.5   1.1 – 3.1 (does not include 1 in the range)
        B        1.4   1.1 – 1.2 (does not include 1 in the range)
        C        1.6   0.9 – 1.7 (it includes 1 in the range)

        The answer is A.
        First look at the absolute value of OR (the same applies to RR), but do not decide merely by observing OR. Also look at the confidence interval (CI). If 1 is included in the CI, then there is no association. Further, if the CI extends below 1, the association could even be negative or protective rather than a significant positive one.

        In group C, the lower limit of the CI is below 1 (0.9), which would imply a protective factor. Since 1 is included in the 95% CI, there is no association.
        In group A, 1 is not included in the interval, so there is definitely an association between exposure and outcome. The absolute value of OR is 2.5, which is the highest.
        In group B, 1 is not included in the CI. There is a positive association, but the absolute value is lower, i.e. 1.4.

        If we change the values in the CI as follows, the answer changes accordingly.

        Groups   OR    95% CI
        A        2.5   1 – 3.1 (it includes 1 in the range)
        B        1.4   1.1 – 1.2 (does not include 1 in the range)
        C        1.6   0.9 – 1.7 (it includes 1 in the range)

        Here, B has more risk.

  4. Attributable risk or risk difference:
    • It answers to what extent is the exposure responsible for the disease in exposed group.
      AR=(Incidence in E+ minus Incidence in E-) x 100/incidence in E+
    • If AR between smoking and lung cancer is 80%, what do you conclude? It means 80% of lung cancer in smokers is because of smoking. Even in smokers, the remaining 20% is due to other risk factors.
    • If RR is given, how can we calculate AR? Since incidence in E- divided by incidence in E+ equals 1/RR, AR = (1 - 1/RR) x 100.
    • Population attributable risk (PAR):
      • It tells us by what percentage the incidence of disease in a population declines if we eliminate the risk factor from that population.
      • If PAR between smoking and lung cancer is 40%, what does it mean? It means that if people stop smoking, the incidence of lung cancer in the population decreases by 40%.
      • PAR is the most important measure of risk from the policy maker's or public health point of view.
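The odds ratio formula and the confidence interval reading from the list above can be sketched in Python as follows. The 2x2 counts are hypothetical, the two groups tables reuse the OR and 95% CI values from the worked question in these notes, and the function names are assumptions made for this example. The selection rule encodes only the reasoning given in the notes: discard groups whose CI includes 1, then pick the highest OR.

```python
def odds_ratio(a, b, c, d):
    """Cross product ratio ad/bc for a 2x2 table:
    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls."""
    return (a * d) / (b * c)


def ci_includes_one(lower, upper):
    """True when the confidence interval contains 1 (no association)."""
    return lower <= 1 <= upper


def group_with_more_risk(groups):
    """Return the group with the highest OR among those whose CI excludes 1
    and whose OR is above 1, or None if no group qualifies."""
    candidates = {
        name: or_value
        for name, (or_value, lower, upper) in groups.items()
        if or_value > 1 and not ci_includes_one(lower, upper)
    }
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Hypothetical case control counts (illustrative only).
example_or = odds_ratio(a=40, b=20, c=60, d=80)

# OR and 95% CI values from the two tables in the notes.
table_1 = {"A": (2.5, 1.1, 3.1), "B": (1.4, 1.1, 1.2), "C": (1.6, 0.9, 1.7)}
table_2 = {"A": (2.5, 1.0, 3.1), "B": (1.4, 1.1, 1.2), "C": (1.6, 0.9, 1.7)}

print(f"OR = {example_or:.2f}")
print(group_with_more_risk(table_1), group_with_more_risk(table_2))
```

Filtering on the confidence interval before comparing OR values mirrors the advice not to decide merely by observing the OR.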

HYPOTHESIS TESTING:

Null hypothesis:

  • By convention, we start with the null hypothesis.
    1. In an analytical study: it says there is no cause-effect relationship (null means no).
    2. In an experimental study: it says there is no difference between Drug A and Drug B (between interventions).
  • We start the study, and if the results also show no difference, we accept the null hypothesis.
  • If the study shows drug A to be better than drug B, we reject the null hypothesis and accept the alternate hypothesis, which is kept ready. The alternate hypothesis is just the opposite of the null hypothesis.

Alternate hypothesis:
It states that

  1. In analytical study: there is a cause-effect relationship
  2. In experimental study: there is a difference between the interventions.
