Discrimination in the Workplace

 

Part I: Downsizing at a Computer Firm  (15 points)

 

The chi-square results are shown below.  Although the negligible p-value suggests a strong dependence between race and the decision to lay off an employee, these data alone are insufficient to conclude that discrimination occurred.  The chi-square test cannot take into account factors other than race that may have contributed to the decision to lay off an employee.  For example, if seniority in the firm is highly correlated with race (so that employees with longer tenure tend to be white), then decisions that are based on seniority may appear to be based on race.

 

Chi-Square Test

 

 

Expected counts are printed below observed counts

 

         white    black    Total

    1     1051      113     1164

       1036.58   127.42

 

    2       31       20       51

         45.42     5.58

 

Total     1082      133     1215

 

Chi-Sq =  0.201 +  1.631 +

          4.577 + 37.232 = 43.641

DF = 1, P-Value = 0.000
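
The output above can be checked by hand. The following is a minimal sketch, not the Minitab procedure itself: it computes expected counts as (row total × column total) / n, sums (observed − expected)² / expected, and uses the fact that for 1 degree of freedom the upper-tail chi-square probability equals erfc(sqrt(x/2)). The function name `chi_square_2x2` is an illustrative choice, and the shortcut for the p-value is valid only for 2x2 tables.

```python
import math

def chi_square_2x2(observed):
    """Pearson chi-square test of independence for a 2x2 table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi_sq = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    # With 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_sq / 2))
    return expected, chi_sq, p_value

# Rows 1 and 2, columns white and black, as in the output above.
observed = [[1051, 113], [31, 20]]
expected, chi_sq, p_value = chi_square_2x2(observed)
print([[round(e, 2) for e in row] for row in expected])
print(round(chi_sq, 3), p_value)
```

Most of the statistic comes from the single cell with the smallest expected count (20 observed against 5.58 expected), which is worth keeping in mind when reading the total.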

 


Part II: Age Discrimination (20 points)

(There are several ways to do this; below is just one example.)

 

Exhibit A

 

Tabulated Statistics: Count (Expected Frequency)

             Active        Terminated    Total
Age < 40     18 (14.09)     7 (10.91)      25
Age ≥ 40     13 (16.91)    17 (13.09)      30
Total        31            24              55

 

Chi-Square = 4.556, DF = 1, P-Value = 0.033

 

The chi-square analysis indicates that, were age unrelated to the decision to terminate an employee, the expected number of terminated employees under age 40 would be nearly 11, while the expected number aged 40 or over would be about 13.  Instead, only 7 employees under the age of 40 were terminated, while 17 aged 40 or over were terminated.  The p-value of 0.033 means that, if the decisions were in fact made at random (that is, if the decision to terminate was independent of the employee’s age), the probability of a discrepancy at least this large between observed and expected counts would be only 0.033.

 

Exhibit B

 

Tabulated Statistics: Count (Expected Frequency)

                Active        Terminated    Total
Wages < mean    22 (15.22)     5 (11.78)      27
Wages ≥ mean     9 (15.78)    19 (12.22)      28
Total           31            24              55

 

 

Chi-Square = 13.605, DF = 1, P-Value = 0.000

 

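The above chi-square analysis shows a much stronger association between wages and status (chi-square = 13.605) than between age and status (chi-square = 4.556), which is the basis for the claim that the real relationship is between wages and status.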
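
Both exhibits can be reproduced from their four cell counts. The sketch below uses the standard shortcut for a 2x2 table, chi-square = n(ad − bc)² / (product of the four marginal totals), with no continuity correction (which is what matches the figures reported above). The function name `chi_square_shortcut` is illustrative, and the p-value formula assumes 1 degree of freedom.

```python
import math

def chi_square_shortcut(a, b, c, d):
    """Chi-square statistic and p-value for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    # Product of the two row totals and the two column totals.
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi_sq = numerator / denominator
    # Upper-tail probability for 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_sq / 2))
    return chi_sq, p_value

# Exhibit A: age (< 40, >= 40) by status (active, terminated).
chi_a, p_a = chi_square_shortcut(18, 7, 13, 17)
# Exhibit B: wages (< mean, >= mean) by status (active, terminated).
chi_b, p_b = chi_square_shortcut(22, 5, 9, 19)

print(round(chi_a, 3), round(p_a, 3))
print(round(chi_b, 3), round(p_b, 3))
```

A larger statistic for wages does not by itself separate the two explanations, since the two tables classify the same 55 employees and the classifying variables are themselves strongly related.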

 

Which is more convincing:

Clearly the plaintiffs have the better case here, at least given the data that we have.  The notion that termination decisions were made based on salaries rather than age does not pass the proverbial “giggle test”, since salaries are highly correlated with age (the correlation coefficient between the two is 0.964 here).  Furthermore, they specifically claimed that the decisions were made randomly.

“Wringing” the Bell Curve (15 points)

 

a. Comment on each of the problems:

Problem 1: If there are interactions between the independent variables included in the regression that are not accounted for in the model, the β’s and their associated significance values cannot be properly interpreted.  For example, IQ, socioeconomic status, and age may very well interact with one another in producing an effect on income.

 

Problem 2: The positive value of the β coefficient on the IQ variable tells us something about an association between IQ and income.  Such an association cannot properly be interpreted as evidence of causality.  It may be that IQ is positively correlated with other factors not included in the model (e.g., drive or ambition) that are in fact what lead to higher income.

 

Problem 3: Normalizing the predictor variable IQ could affect the relationship between IQ and the outcome variable.  That is, a relationship might “artificially” appear to exist when in fact it does not.  (This is really a very picky point, though.)

 

Problem 4: If educational attainment and IQ are collinear, as seems likely, then including one but not the other in the model ignores the effect that the omitted variable has on the dependent variable.  That is, the “true” effects of the two variables overlap, but when only one is included, the effects of both are attributed to it.

 

b. Propose a different model:

Different specifications of the model might include interaction effects and test for the significance of the coefficients on the interaction variables.  Also, including level of education in the model and testing significance of the coefficient with and without including IQ in the model would be useful.  Any model should be tested for its overall usefulness in predicting income via an F-test.
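
As a sketch of what such a specification could look like, one possible model is written out below. The variable names (IQ, SES, Age, Educ) and the particular interaction terms chosen are illustrative assumptions, not the specification from the original analysis. The second line is the standard overall F statistic for a model with k predictors fitted to n observations, testing the null hypothesis that all k slope coefficients are zero.

```latex
E(\text{Income}) = \beta_0 + \beta_1\,\text{IQ} + \beta_2\,\text{SES} + \beta_3\,\text{Age}
                 + \beta_4\,\text{Educ} + \beta_5\,(\text{IQ}\times\text{SES})
                 + \beta_6\,(\text{IQ}\times\text{Age})

F = \frac{R^2 / k}{(1 - R^2)/(n - k - 1)}, \qquad
H_0\colon \beta_1 = \beta_2 = \cdots = \beta_k = 0
```

Under this sketch, the significance of the interaction terms would be judged from the t-tests on their coefficients, and the role of education from comparing the fit with and without IQ in the model.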
Chattergee (10 points)

 

Regression Analysis

 

 

The regression equation is

LogUsage = 4.31 - 1.60 LogTemp

 

Predictor        Coef       StDev          T        P

Constant       4.3083      0.1819      23.69    0.000

LogTemp       -1.5989      0.1059     -15.09    0.000

 

S = 0.1051      R-Sq = 81.1%     R-Sq(adj) = 80.8%

 

Analysis of Variance

 

Source            DF          SS          MS         F        P

Regression         1      2.5167      2.5167    227.77    0.000

Residual Error    53      0.5856      0.0110

Total             54      3.1023

 

Unusual Observations

Obs    LogTemp   LogUsage         Fit   StDev Fit    Residual    St Resid

  3       1.76     1.2858      1.5009      0.0149     -0.2152       -2.07R

 45       1.74     1.2868      1.5257      0.0145     -0.2389       -2.29R

 47       1.86     1.0176      1.3387      0.0210     -0.3211       -3.12R

 54       1.38     2.0056      2.1016      0.0378     -0.0960       -0.98 X

 

R denotes an observation with a large standardized residual

X denotes an observation whose X value gives it large influence.

 

Predicted Values

 

     Fit  StDev Fit         95.0% CI             95.0% PI

  1.5919     0.0142   (  1.5634,  1.6205)  (  1.3792,  1.8047)  

 

 

LogUsage = 1.5919 ⇒ Usage = 10^1.5919 = 39.075 kWh
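
The back-transformation can be sketched as follows. It assumes the logs are base 10, which is consistent with 10^1.5919 = 39.075, and it applies the same transformation to the endpoints of the 95% prediction interval from the output above. The helper name `back_transform` is illustrative.

```python
# Values copied from the "Predicted Values" output (log10 scale).
fit = 1.5919
pred_interval = (1.3792, 1.8047)

def back_transform(log10_value):
    """Convert a base-10 log value back to the original units."""
    return 10 ** log10_value

usage = back_transform(fit)
pi_low, pi_high = (back_transform(v) for v in pred_interval)

print(round(usage, 3))
print(round(pi_low, 2), round(pi_high, 2))
```

Because the model is fitted on the log scale, the back-transformed interval is not symmetric around the point prediction, and the point prediction is better read as a typical (median) usage than as a mean.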


11.27   (30 points)

a. E(y) = β₀ + β₁x₁ + β₂x₂ + β₃x₃ + β₄x₄ + β₅x₅

β₀ is the constant term.  Holding the other variables fixed, β₁ gives the additional salary attributable to being male; β₂ the additional salary attributable to being white; β₃ the additional salary per year of education; β₄ the additional salary per year with the firm; and β₅ the additional salary per hour worked per week.

 

b. Interpret the β’s as above.

c. R² = 0.240 suggests that approximately 24% of the variation in salaries is explained by the independent variables included in the model.


 


For α = 0.05, Fα = 2.42, so reject H0.  (Alternatively, p = 0.0000...)

d. H0: β₁ = 0, HA: β₁ > 0.  p = 0.025, so we reject H0 at α = 0.05.

e. A discrepancy in the salary figures alone is not enough to conclude that the difference stems from gender discrimination.  For example, if women in the sample are generally less educated than the men in the sample, or have been at the firm for fewer years, we would expect them to receive lower salaries for these reasons.  This would not be indicative of discrimination.  By including these factors in the model, we can make inferences about the salary discrepancies that “control” for these other influences.

f. If gender and tenure with the firm interact, the β coefficients on these variables will be biased.  That is, the model does not account for the possibility that the effect of a change in one of these variables depends on the value of the other.