A team of researchers is studying the determinants of the practice of untouchability among Indians. They use the second round of the India Human Development Survey for this. Their (simplified) estimated model is (standard errors in parentheses):
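\[
\widehat{\text{untch}} = \underset{(0.26)}{0.728} - \underset{(0.113)}{0.645}\,\text{urban} - \underset{(0.009)}{0.0081}\,\text{hh\_edu} + \underset{(1.89 \times 10^{-7})}{4 \times 10^{-7}}\,\text{fam\_inc} - \underset{(0.00047)}{0.00004}\,(\text{hh\_edu} \times \text{urban})
\]
\[
n = 33{,}319, \qquad R^2 = 0.0998
\]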
untch = dummy for practicing untouchability (1 if the household practices untouchability, 0 otherwise).
urban = dummy for urban residence (1 if urban, 0 otherwise).
hh_edu = years of education of the household head.
fam_inc = annual family income (in rupees).
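The fitted equation can be evaluated directly for any household. The following is a minimal sketch, assuming the point estimates reported above; the function name `predicted_prob` and the two example households are illustrative and are not taken from the survey or the original analysis. Because this is a linear probability model, the fitted value is read as a probability, although nothing forces it to lie between 0 and 1.

```python
# Point estimates copied from the estimated equation above.
COEF = {
    "const": 0.728,
    "urban": -0.645,
    "hh_edu": -0.0081,
    "fam_inc": 4e-7,            # income measured in rupees
    "hh_edu_urban": -0.00004,   # interaction of hh_edu and urban
}


def predicted_prob(urban, hh_edu, fam_inc):
    """Fitted value of the linear probability model for one household."""
    return (
        COEF["const"]
        + COEF["urban"] * urban
        + COEF["hh_edu"] * hh_edu
        + COEF["fam_inc"] * fam_inc
        + COEF["hh_edu_urban"] * hh_edu * urban
    )


# Hypothetical rural household: head with 10 years of education, income Rs. 100,000.
p_rural = predicted_prob(urban=0, hh_edu=10, fam_inc=100_000)

# Hypothetical urban household: head with 12 years of education, zero income.
p_urban = predicted_prob(urban=1, hh_edu=12, fam_inc=0)

print(round(p_rural, 4))
print(round(p_urban, 4))
```

The first fitted value is roughly 0.687. The second is slightly negative, which illustrates a known limitation of the linear probability model: fitted "probabilities" can fall outside the unit interval.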
(1) What is the predicted probability that a rural household with an illiterate household head and an annual family income of 3.5 lakh rupees (Rs. 350,000) practices untouchability? Use the estimated linear probability model.
(2) Looking at the coefficients and standard errors, would you say that households with more educated heads in urban areas have a lower probability of practicing untouchability than similar households in rural areas? State your reasons clearly.
(3) Suppose you also want to check whether the intercept differs by the caste of the household. Each household in the data belongs to one of four administrative caste categories: Scheduled Caste (SC), Scheduled Tribe (ST), Other Backward Classes (OBC) and General (GEN). How would you modify the population model underlying the estimated equation (i.e., written in terms of population parameters)? State clearly which variables you would add and how each is interpreted.
(4) Suppose now that fam_inc is measured in thousands of rupees instead of rupees. How will the estimated equation change as a result of this change of scale? Rewrite the estimated equation, indicating the change in each coefficient and its standard error. Also indicate any change in R^2.
(5) One of the factors influencing the practice of untouchability is the cultural norm of untouchability. Even highly educated and well-off households find it difficult to go against this norm, and on average more so in rural areas than in urban areas. If urban is your main variable of interest, what direction of bias do you expect in the estimated coefficient on urban when the strength of the cultural norm is an omitted variable? (Assume for this question that urban is the only independent variable included in the model.)
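The arithmetic behind questions (1), (2) and (4) can be sketched in a few lines. This is a hedged illustration rather than a model answer: it assumes that an illiterate head corresponds to hh_edu = 0, takes 3.5 lakhs as Rs. 350,000, and uses |t| > 1.96 as the conventional two-sided 5% threshold. All variable names are made up for the example.

```python
# (1) Rural household (urban = 0), hh_edu = 0 (assumed for an illiterate head),
#     annual family income of Rs. 350,000. Only the intercept and the income
#     term contribute to the fitted value.
p_q1 = 0.728 + 4e-7 * 350_000

# (2) t-ratio on the hh_edu x urban interaction: coefficient / standard error.
t_interaction = -0.00004 / 0.00047
interaction_significant = abs(t_interaction) > 1.96

# (4) Measuring income in thousands of rupees divides the regressor by 1000,
#     so its coefficient and standard error are both multiplied by 1000.
scale = 1000
b_inc_thousands = 4e-7 * scale
se_inc_thousands = 1.89e-7 * scale

# The t-ratio on income is the same before and after rescaling.
t_inc_rupees = 4e-7 / 1.89e-7
t_inc_thousands = b_inc_thousands / se_inc_thousands

print(round(p_q1, 3))
print(round(t_interaction, 3), interaction_significant)
print(b_inc_thousands, se_inc_thousands)
print(round(t_inc_rupees, 3), round(t_inc_thousands, 3))
```

Under these assumptions the fitted probability in (1) is about 0.868, and the interaction t-ratio in (2) is about -0.085, far below conventional significance thresholds. In (4), rescaling fam_inc changes only its own coefficient and standard error; the other coefficients, their standard errors, the t-ratios, and R^2 are unaffected because the fitted values and residuals are the same.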