Effect Sizes for Categorical (Binary/Dichotomous) Variables — Categorical Association Family
- This family is designed specifically to capture relationships between two or more categorical variables.
In Confidence Interval [3], we studied the Proportion Z-Test (single-sample and two-sample). Here we continue the idea of comparing categorical data 🤘🏻
- We want to use effect size to quantify the strength or magnitude of a relationship between variables.
- We studied the difference family, but now our values represent categories rather than numeric values, so we need a different family.
Categorical Association Family
Why there are many:
Different measures are optimized for:
- Different table sizes (2×2 vs larger tables),
- The measurement level of the variables (nominal vs ordinal),
- And specific research questions (e.g., comparing relative risks vs testing independence).
Multiple Ways to Express Relationships:
Even for a simple comparison like a 2×2 table, the relationship can be expressed in different, non-equivalent ways, each offering a unique perspective:
- Difference: How much greater is the proportion in one group (Risk Difference)?
- Ratio: How many times more likely is an outcome in one group (Risk Ratio, Odds Ratio)?
- Association: How strong is the overall relationship between the two variables, often standardized like a correlation (Phi coefficient, Cramér's V)?
Different fields or research questions favor different metrics.

Probabilities, Odds, and Log Odds
Odds vs Probability
- Actually it’s very simple and not much different from probability. It’s just a different way to report the same information.
- Uncertainty or chance can be expressed either as a probability or as odds.
- Odds in statistics are the relative likelihood of an outcome for each side of a competition.
- Odds range from 0 to +∞.
- Example: 100 people took a medicine (80 got better, 20 did not)
- Proportion or Probability = Number who got better / Total → Prob = 80/100 = 0.8
- Odds = Number who got better / Number who didn’t → Odds = 80/20 = 4
- Probability says: “80% of people got better.”
- Odds says: “For every 1 person who didn’t get better, 4 did.”
Tip:
- Use proportion (%) when you're speaking about risk or chance.
- Use odds when you're comparing two groups (e.g., Odds Ratio in case-control studies).
- Note that the odds of one outcome can be expressed as the reciprocal of the other.
- Odds of getting better is 4
- Odds of not getting better is 1:4
- If you get a probability of 100%, that means the odds are 1:0, which is undefined (infinite odds).
- A probability of 0 is the same as odds of 0.
- Probabilities between 0 and 0.5 equal odds less than 1.0.
- A probability of 0.5 is the same as odds of 1.0. (Flipping Coin 50/50) (equiprobable)
- Probabilities sum to 1, but odds do not.
Conversion:
- In reality, the odds are just a re-expression of probability.
- Odds are simply the probability that outcome A happens divided by the probability that it does not.
- Odds = p / (1 − p)
- Probability: p = Odds / (1 + Odds)
- Odds are a very common term in finance, sports betting, and gambling.
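A minimal Python sketch of the two conversions above (the function names are just illustrative):

```python
def prob_to_odds(p):
    """Odds = p / (1 - p). Undefined (infinite) at p = 1."""
    return p / (1 - p)

def odds_to_prob(odds):
    """p = odds / (1 + odds)."""
    return odds / (1 + odds)

print(prob_to_odds(0.8))   # 4.0 -> "4 got better for every 1 who didn't"
print(odds_to_prob(4.0))   # 0.8 -> back to the 80% probability
print(prob_to_odds(0.5))   # 1.0 -> equiprobable (fair coin)
```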
Log Odds (logits)
- If the odds are against me (probability less than 50%), the odds are between 0 and 1.
- If the odds are in my favor (probability greater than 50%), the odds are between 1 and +∞.
- This asymmetry makes it difficult to compare odds against me, in [0, 1), with odds in my favor, in (1, +∞).
- Taking the log of the odds —> Solves the problem by making everything symmetrical around ZERO.
- Now the range is −∞ to +∞.
- Using the log, we can easily interpret the sign of the log odds and compare values easily.
- Note that LOG, converts Division into Subtraction — That’s why we get a symmetrical interval. ← VERY IMPORTANT
- Consider 6:1 as the odds.
- log(6/1) = 0.778 and log(1/6) = −0.778 → symmetric results in both directions.
- This is the quotient rule for logarithms: log(a/b) = log(a) − log(b), so log(1/6) = −log(6).
- Log Odds or Logits are very important in Logistic Regression.
- We assume that writing log with no base means we are using base 10, so it’s log₁₀. (That matches log(6) ≈ 0.778 above.)
- Another reason to use log odds is that it is usually difficult to model variables with restricted ranges, such as probabilities.
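A quick check of the base-10 symmetry above, using nothing beyond the 6:1 example:

```python
import math

# Base-10 logs, matching the 6:1 example above.
print(math.log10(6 / 1))   #  0.778...
print(math.log10(1 / 6))   # -0.778... (quotient rule: log(1/x) = -log(x))

# Log odds of an equiprobable event sit exactly at zero:
print(math.log10(1.0))     # 0.0 (odds 1:1, probability 0.5)
```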
Log Odds in Logistic Regression
- Since logistic regression is mainly binary classification, we want the output to be a probability between 0 and 1.
- A standard linear regression model looks like: Y=β0+β1X1+β2X2+...+βnXn. The output (Y) of this linear combination can range from negative infinity (−∞) to positive infinity (+∞).
- Logistic Regression doesn’t model the probability (P) directly as the output (Y).
- If we tried to directly model probability (P) using this linear equation (P = β₀ + β₁X₁ + ...), the model could predict probabilities less than 0 or greater than 1 for certain values of the predictors (Xᵢ). This makes no practical sense for P.
- To overcome the boundary issue, logistic regression transforms the probability using a function that maps the interval [0, 1] to the entire real number line (−∞,+∞). This allows us to use a linear combination of predictors.
- This is what we just studied above about Odds and Log Odds!
- The range of Odds is [0,+∞). This is better, but still not the full real line (it doesn't go to −∞).
- The range of Log-Odds is (−∞,+∞). This matches the potential output range of the linear combination of predictors 😁
The logit function is the inverse of the sigmoid: logit(p) = log(p / (1 − p)).
- IMPORTANT: That’s why in Logistic Regression, if you use the raw output (log-odds) to decide between positive and negative, you check whether it is < 0 or > 0; but if you apply the sigmoid first, you check whether it is < 0.5 or > 0.5.
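A small sketch of that equivalence, with sigmoid and logit written out by hand (illustrative names; natural log, as used in logistic regression):

```python
import math

def sigmoid(z):
    """Maps log-odds z in (-inf, +inf) to a probability in (0, 1)."""
    return 1 / (1 + math.exp(-z))

def logit(p):
    """Inverse of sigmoid: maps a probability back to log-odds."""
    return math.log(p / (1 - p))

z = 0.8                   # raw model output (log-odds)
print(z > 0)              # True -> positive class on the log-odds scale
print(sigmoid(z) > 0.5)   # True -> same decision on the probability scale
print(logit(sigmoid(z)))  # 0.8  -> logit undoes sigmoid
```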
Exponentiating Coefficients for Better Interpretability
- As our linear equation generates log odds, βᵢ is the change in log-odds per one-unit change in Xᵢ. Can we translate that into something easier to understand?
- log(Odds) = β₀ + β₁X₁
- Odds = e^(β₀ + β₁X₁)
- Odds(X₁ + 1) / Odds(X₁) = e^(β₀ + β₁(X₁ + 1)) / e^(β₀ + β₁X₁) = e^(β₁)
- Based on the final line, we can now create an odds ratio, which is the odds of one thing divided by the odds of another.
- Odds Ratio is explained in the next section.
- This factor, e^(β₁), is called the Odds Ratio (OR).
- Example: You run an online store. You want to model the likelihood a user buys a product based on:
time_on_site. Suppose your logistic regression model gives this coefficient: β₁ = 0.693.
- OR = e^(β₁) = e^0.693 ≈ 2
- For every extra minute someone spends on the site, the odds of buying the product double (are multiplied by 2)!
- Let’s say a user currently has a 20% probability of buying. That means Odds = p / (1 − p) = 0.2 / 0.8 = 0.25. If they stay 1 more minute, and we apply the odds ratio: New odds = 0.25 × 2 = 0.5, the probability will be p = 0.5 / (1 + 0.5) = 0.333 (or 33%). That means one extra minute bumped their chance from 20% to 33% — that’s a big jump, just from a small β₁ = 0.693.
- What if we flip, and it was β₁ = -1.1 (With a -ve sign)?
- OR = e^(-1.1) ≈ 0.33
- Each extra minute spent actually reduces the odds of purchase to 1/3 of what they were.
- Maybe the user is wasting time = less likely to buy
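A quick numeric check of both cases above, using only the formulas already introduced:

```python
import math

# Reproducing the time_on_site example with plain arithmetic.
beta1 = 0.693                      # coefficient for one extra minute
OR = math.exp(beta1)               # ~2.0: odds double per extra minute
p = 0.20                           # current probability of buying
odds = p / (1 - p)                 # 0.25
new_odds = odds * OR               # 0.5
new_p = new_odds / (1 + new_odds)  # ~0.333
print(round(OR, 2), new_odds, round(new_p, 3))

# The negative case: beta1 = -1.1 shrinks the odds instead.
print(round(math.exp(-1.1), 2))    # ~0.33: odds drop to a third per minute
```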
[1] Odds Ratio (OR):
The Odds Ratio (OR) is the ratio of two different odds.
Definition:
Describes the odds of an outcome occurring in one group relative to another.
Categorical Exposure vs Categorical Outcome
- In OR and RR, we have two things: Categorical Exposure vs Categorical Outcome
| Categorical Exposure (X) | Categorical Outcome (Y) |
|---|---|
| Intervention vs Control | Cancer vs No Cancer |
| Smokers vs Non-Smokers | Died vs Alive |
| Over 30 BMI vs Below 30 BMI | Dies vs Survives |
| Male vs Female | Depression vs No Depression |
| Treatment vs No Treatment | Passed Exam vs Failed |
| Referred by Friend vs Not | Subscribed vs Not Subscribed |
Example: Smoking and Lung Cancer
- You conduct a study on 200 people to see if there's a relationship between smoking and lung cancer.
| Group | Lung Cancer | No Lung Cancer | Total |
|---|---|---|---|
| Smokers | 40 | 60 | 100 |
| Non-Smokers | 10 | 90 | 100 |
- Odds (alone): We would say: among smokers, the odds of lung cancer are 0.67 (40/60).
- Odds (alone): We would say: among non-smokers, the odds of lung cancer are 0.11 (10/90).
- Odds Ratio: We would say: smokers have 6 times higher odds of lung cancer compared to non-smokers.
- This is from: Odds for smokers / Odds for non-smokers = (40/60) / (10/90) = 6.0. (Dividing the rounded values 0.67/0.11 gives ≈ 6.1; the exact ratio is 6.)
- The odds ratio is an effect size measure. A large odds ratio means that smoking is a strong indicator of lung cancer.
- It is commonly used in medical and health research, social studies, and risk studies.
- Odds Ratio is not a Single-Group Study — It’s comparing two groups.
- Odds Ratio = how the odds compare between groups
- Odds Ratio is very important in Logistic Regression, as we saw in {Exponentiating Coefficients for Better Interpretability} in the previous section.
- Odds Ratio is still asymmetrical.
- Log(Odds Ratio): Just like the odds, taking the log makes things nice and symmetrical.
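A minimal sketch computing the OR, its log, and an approximate 95% CI from the smoking table; the standard-error formula SE = √(1/a + 1/b + 1/c + 1/d) is Woolf's common approximation, my addition:

```python
import math

# 2x2 table from the smoking example:  cancer / no cancer
a, b = 40, 60                        # smokers
c, d = 10, 90                        # non-smokers

OR = (a / b) / (c / d)               # (40/60) / (10/90) = 6.0
log_or = math.log(OR)                # symmetric around 0

# Woolf's approximate 95% CI, built on the log-odds-ratio scale:
se = math.sqrt(1/a + 1/b + 1/c + 1/d)
lo = math.exp(log_or - 1.96 * se)
hi = math.exp(log_or + 1.96 * se)
print(round(OR, 2), (round(lo, 2), round(hi, 2)))  # 6.0 (2.79, 12.91)
```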
[2] Risk Ratio (RR) (Relative Risk):
- It’s very simple and naive: it is proportion over proportion instead of odds over odds.
- Proportion == Probability == Risk
The Relative Risk (RR) is the risk of the outcome in an experimental group relative to that in a control group.
- Risk Ratio calculates the ratio between the proportion of cases in the treatment group and the proportion of cases in the control group
- Risk Ratio is an effect size because it is Unit Free.
TODO / open questions: tests to do? relation to cross entropy; case-control studies (number of patients, rare diseases); causation reasons; reversed odds ratio; Poisson CIs for odds ratios?; meta-analysis; what is Equality of Odds? https://mlu-explain.github.io/equality-of-odds/
Example: New Drug to Prevent Flu
| Group | Got Flu | Didn't Get Flu | Total |
|---|---|---|---|
| Treatment (Drug) | 10 | 90 | 100 |
| Control (No Drug) | 30 | 70 | 100 |
- Step 1: Calculate the risk in each group (getting flu): Risk(Treatment) = 10/100 = 0.10; Risk(Control) = 30/100 = 0.30
- Step 2: Calculate the Risk Ratio: RR = 0.10 / 0.30 ≈ 0.33
People who took the drug were 67% less likely to get the flu (RR = 0.33 means a 1 − 0.33 = 67% relative risk reduction).
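A similar sketch for the RR and an approximate 95% CI from the flu table; the log-scale SE formula here is the common Katz approximation, my addition:

```python
import math

# Flu example: events / totals per group.
a, n1 = 10, 100   # treatment group: got flu / total
c, n2 = 30, 100   # control group:   got flu / total

p1, p2 = a / n1, c / n2
RR = p1 / p2                              # 0.10 / 0.30 = 0.333...

# Approximate 95% CI on the log-risk-ratio scale:
se = math.sqrt(1/a - 1/n1 + 1/c - 1/n2)
lo = math.exp(math.log(RR) - 1.96 * se)
hi = math.exp(math.log(RR) + 1.96 * se)
print(round(RR, 3), (round(lo, 2), round(hi, 2)))  # 0.333 (0.17, 0.64)
```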
Odds Ratio AND Risk Ratio
- Values for RR and OR should be accompanied by 95% confidence intervals and statistical significance.

How to move from Association / Correlation to Causation?
[3] Risk Difference (RD):
- Risk Ratio (Relative Risk, RR):
- The ratio of the probability of an event occurring in the exposed group to the probability in the control group.
- Risk Difference (RD) / Absolute Risk Reduction (ARR):
- The absolute difference in the probability of an event between groups.
- Hazard Ratio (HR): The ratio of event rates between two groups over time, used in survival analysis.
- Other Association Measures:
These are used particularly for 2×2 tables or more general contingency tables:
- Phi Coefficient (φ):
- A correlation measure for binary variables.
- Yule’s Q and Yule’s Y:
- Alternative transformations of the odds ratio for interpretability and symmetry.
- Cramér’s V:
- A measure of association for contingency tables larger than 2×2.
- Contingency Coefficient:
- Another statistic that assesses the strength of association in a contingency table.
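A sketch computing φ and Cramér's V from a chi-square statistic, reusing the smoking table above; the formulas φ = √(χ²/n) and V = √(χ²/(n·(min(r,c)−1))) are standard:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Smoking 2x2 table: rows = smokers / non-smokers, cols = cancer / no cancer.
table = np.array([[40, 60],
                  [10, 90]])

chi2, p, dof, expected = chi2_contingency(table, correction=False)
n = table.sum()

phi = np.sqrt(chi2 / n)              # phi coefficient (2x2 tables)
k = min(table.shape) - 1             # min(rows, cols) - 1
cramers_v = np.sqrt(chi2 / (n * k))  # equals phi for a 2x2 table
print(round(float(phi), 3), round(float(cramers_v), 3))  # 0.346 0.346
```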
- Diagnostic and Classification Specific:
- Diagnostic Odds Ratio (DOR):
- Used in diagnostic test evaluations.
Effect Sizes for Correlation Measures
These statistics quantify the strength and direction of the relationship between variables without assuming a cause-effect direction.
- Pearson’s r:
Measures the linear correlation between two continuous variables.
- Other Correlational Approaches for Continuous Data:
- Spearman’s ρ: A nonparametric measure of rank correlation.
- Kendall’s Tau (τ): Another rank-based measure useful especially for smaller samples or data with ties.
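A minimal comparison of the three coefficients on made-up data (the data and seed are arbitrary):

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr, kendalltau

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 0.7 * x + rng.normal(scale=0.5, size=100)  # roughly linear relationship

r, _ = pearsonr(x, y)      # linear correlation
rho, _ = spearmanr(x, y)   # rank correlation
tau, _ = kendalltau(x, y)  # rank-based; handles ties and small samples well
print(round(r, 2), round(rho, 2), round(tau, 2))
```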
- Coefficient of Determination (R²):
While often used in regression, it indicates the proportion of variance in one variable explained by another. It is conceptually related but should be treated separately from simple correlation coefficients.
Effect Sizes in Regression and Model-Based Analysis
These measures are often used to gauge the practical significance of predictors:
- Coefficient of Determination (R²):
- The proportion of variance in the dependent variable explained by the independent variables.
- f² (Cohen’s f-squared):
- An effect size for multiple regression indicating the change in R² relative to the unexplained variance.
- Standardized Regression Coefficients (Beta weights):
- Allow comparison of effect sizes across predictors measured on different scales.
- Semi-partial (or Part) Correlations:
- Reflect the unique contribution of each predictor in explaining variance.
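A tiny sketch of Cohen's f², assuming the usual definition f² = ΔR² / (1 − R²_full); the example R² values are made up:

```python
def cohens_f2(r2_full, r2_reduced=0.0):
    """Cohen's f-squared. With r2_reduced = 0 it is R2 / (1 - R2);
    otherwise it measures the incremental effect of added predictors."""
    return (r2_full - r2_reduced) / (1 - r2_full)

print(round(cohens_f2(0.30), 3))        # ~0.429 (overall model effect)
print(round(cohens_f2(0.30, 0.25), 3))  # ~0.071 (unique contribution)
```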
Effect Sizes for Agreement and Reliability Measures
Used in contexts where the consistency or reproducibility of ratings or observations is evaluated.
- Cohen’s Kappa (κ):
Assesses inter-rater agreement for categorical items beyond chance.
- Intraclass Correlation Coefficient (ICC):
Evaluates the reliability of measurements or ratings when more than two raters or observations are involved.
- Krippendorff’s Alpha:
A versatile measure applicable to various data types, including nominal and ordinal.
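A minimal example of Cohen's κ using sklearn (the rater labels are made up):

```python
from sklearn.metrics import cohen_kappa_score

# Two raters labelling the same 10 items:
rater1 = ["yes", "yes", "no", "yes", "no", "no", "yes", "no", "yes", "yes"]
rater2 = ["yes", "no",  "no", "yes", "no", "yes", "yes", "no", "yes", "yes"]

# Kappa corrects raw agreement (here 8/10) for agreement expected by chance.
print(cohen_kappa_score(rater1, rater2))  # ~0.583
```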
