# Dummy Variable Regression

• 1. Dummy Variable Models
• 2. Multiple Regression. Sandy: pages 603 - 613. Also read the paper titled “Using Dummy Variables in Wage Discrimination Cases.”
• 3. Are Male Nurses Discriminated Against? [Figure: wages of male nurses and female nurses plotted against years of experience, $X_i$, comparing the mean wages of men ($W_m$) and women ($W_f$) adjusted for experience with the means not adjusted for experience.]
• 5. Intercept Dummy Variables. Dummy variables are binary (0, 1): $D_t = 1$ if red car, $D_t = 0$ otherwise. Model: $y_t = \beta_1 + \beta_2 X_t + \beta_3 D_t + e_t$, where $y_t$ = speed of car in miles per hour and $X_t$ = age of car in years. Police: red cars travel faster. $H_0\colon \beta_3 = 0$ vs. $H_1\colon \beta_3 > 0$.
• 6. $y_t = \beta_1 + \beta_2 X_t + \beta_3 D_t + e_t$. Red cars: $y_t = (\beta_1 + \beta_3) + \beta_2 X_t + e_t$. Other cars: $y_t = \beta_1 + \beta_2 X_t + e_t$. [Figure: miles per hour, $y_t$, against age in years, $X_t$; two parallel lines with slope $\beta_2$, intercept $\beta_1 + \beta_3$ for red cars and $\beta_1$ for other cars.]
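The two lines implied by the intercept-dummy model can be sketched in a few lines of Python. This is only an illustration: the function name `expected_speed` and the coefficient values are assumptions, not estimates from the slides.

```python
def expected_speed(age_years, is_red, beta1, beta2, beta3):
    """Expected speed beta1 + beta2 * X_t + beta3 * D_t for the intercept-dummy model."""
    d = 1 if is_red else 0  # D_t = 1 for a red car, 0 otherwise
    return beta1 + beta2 * age_years + beta3 * d


# Hypothetical coefficients, chosen only for illustration.
BETA1, BETA2, BETA3 = 70.0, -1.5, 5.0

for age in (0, 4, 8):
    red = expected_speed(age, True, BETA1, BETA2, BETA3)
    other = expected_speed(age, False, BETA1, BETA2, BETA3)
    # The vertical gap between the two lines equals beta3 at every age.
    print(age, red, other, red - other)
```

Because the dummy enters only the intercept, the gap between red and other cars does not depend on the age of the car.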
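• 7. Slope Dummy Variables. Model: $y_t = \beta_1 + \beta_2 X_t + \beta_3 D_t X_t + e_t$. Stock portfolio ($D_t = 1$): $y_t = \beta_1 + (\beta_2 + \beta_3) X_t + e_t$. Bond portfolio ($D_t = 0$): $y_t = \beta_1 + \beta_2 X_t + e_t$. Here $\beta_1$ = initial investment. [Figure: value of portfolio, $y_t$, against years, $X_t$; both lines start at $\beta_1$, with slope $\beta_2 + \beta_3$ for stocks and $\beta_2$ for bonds.]
• 8. Different Intercepts & Slopes. Model: $y_t = \beta_1 + \beta_2 X_t + \beta_3 D_t + \beta_4 D_t X_t + e_t$. “Miracle” seed ($D_t = 1$): $y_t = (\beta_1 + \beta_3) + (\beta_2 + \beta_4) X_t + e_t$. Regular seed ($D_t = 0$): $y_t = \beta_1 + \beta_2 X_t + e_t$. [Figure: harvest weight of corn, $y_t$, against rainfall, $X_t$; the “miracle” line has intercept $\beta_1 + \beta_3$ and slope $\beta_2 + \beta_4$, the regular line has intercept $\beta_1$ and slope $\beta_2$.]
• 9. $y_t = \beta_1 + \beta_2 X_t + \beta_3 D_t + e_t$. For men, $D_t = 1$: $y_t = (\beta_1 + \beta_3) + \beta_2 X_t + e_t$. For women, $D_t = 0$: $y_t = \beta_1 + \beta_2 X_t + e_t$. Testing for discrimination in starting wage: $H_0\colon \beta_3 = 0$ vs. $H_1\colon \beta_3 > 0$. [Figure: wage rate, $y_t$, against years of experience, $X_t$; parallel lines with slope $\beta_2$, intercept $\beta_1 + \beta_3$ for men and $\beta_1$ for women.]
• 10. $y_t = \beta_1 + \beta_5 X_t + \beta_6 D_t X_t + e_t$. For men, $D_t = 1$: $y_t = \beta_1 + (\beta_5 + \beta_6) X_t + e_t$. For women, $D_t = 0$: $y_t = \beta_1 + \beta_5 X_t + e_t$. Men and women have the same starting wage, $\beta_1$, but their wage rates increase at different rates (diff. = $\beta_6$). $\beta_6 > 0$ means that men's wage rates are increasing faster than women's wage rates. [Figure: wage rate against years of experience; both lines start at $\beta_1$, with slope $\beta_5 + \beta_6$ for men and $\beta_5$ for women.]
• 11. An Ineffective Affirmative Action Plan. $y_t = \beta_1 + \beta_2 X_t + \beta_3 D_t + \beta_4 D_t X_t + e_t$. Men: $y_t = (\beta_1 + \beta_3) + (\beta_2 + \beta_4) X_t + e_t$. Women: $y_t = \beta_1 + \beta_2 X_t + e_t$. Women are given a higher starting wage, $\beta_1$, while men get the lower starting wage, $\beta_1 + \beta_3$ ($\beta_3 < 0$). But men get a faster rate of increase in their wages, $\beta_2 + \beta_4$, which is higher than the rate of increase for women, $\beta_2$ (since $\beta_4 > 0$). Note: women are started at a higher wage ($\beta_3 < 0$). [Figure: wage rate against years of experience; the men's line starts lower and rises more steeply than the women's line.]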
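With a lower intercept but a steeper slope, the men's line eventually overtakes the women's line. A minimal sketch of where that happens, assuming the model above; the function name `crossover_experience` and the numbers are hypothetical.

```python
def crossover_experience(beta3, beta4):
    """Years of experience at which the men's and women's wage lines cross.

    Setting (beta1 + beta3) + (beta2 + beta4) * X equal to beta1 + beta2 * X
    leaves beta3 + beta4 * X = 0, so X = -beta3 / beta4.
    """
    if beta4 == 0:
        raise ValueError("equal slopes: the two lines never cross")
    return -beta3 / beta4


# Hypothetical values: men start 2.0 lower (beta3 < 0) but gain 0.5 more per year (beta4 > 0).
years = crossover_experience(beta3=-2.0, beta4=0.5)
print(years)
```

After the crossover point the men's wage is higher at every level of experience, which is why the higher starting wage for women does not offset the slope difference.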
• 12. Testing Qualitative Effects
• 1. Test for differences in intercept.
• 2. Test for differences in slope.
• 3. Test for differences in both intercept and slope.
• 13. Model: $Y_t = \beta_1 + \beta_2 X_t + \beta_3 D_t + \beta_4 D_t X_t + e_t$, with men: $D_t = 1$; women: $D_t = 0$. Intercept (testing for discrimination in starting wage): $H_0\colon \beta_3 = 0$ vs. $H_1\colon \beta_3 > 0$, using $b_3 / \sqrt{\text{Est. Var}(b_3)} \sim t_{n-4}$. Slope (testing for discrimination in wage increases): $H_0\colon \beta_4 = 0$ vs. $H_1\colon \beta_4 > 0$, using $b_4 / \sqrt{\text{Est. Var}(b_4)} \sim t_{n-4}$.
• 14. Why NOW wants a one-sided test and Chauvinist Industries wants a two-sided test.
• 15. Are Two Regressions Equal? Variations of “The Chow Test.” I. Assuming equal variances (pooling): $y_t = \beta_1 + \beta_2 X_t + \beta_3 D_t + \beta_4 D_t X_t + e_t$, where $y_t$ = wage rate, $X_t$ = years of experience, men: $D_t = 1$; women: $D_t = 0$. $H_0\colon \beta_3 = \beta_4 = 0$ vs. $H_1$: otherwise. This model assumes equal wage rate variance.
• 16. Testing intercept and slope: $H_0\colon \beta_3 = \beta_4 = 0$ vs. $H_1$: otherwise. $SSE_R = \sum_{t=1}^{T} (y_t - b_1 - b_2 X_t)^2$ and $SSE_U = \sum_{t=1}^{T} (y_t - b_1 - b_2 X_t - b_3 D_t - b_4 D_t X_t)^2$. Test statistic: $\dfrac{(SSE_R - SSE_U)/2}{SSE_U/(T-4)} \sim F_{2,\,T-4}$.
• 17. II. Allowing for unequal variances (running three regressions). Everyone: $y_t = \beta_1 + \beta_2 X_t + e_t$, which gives $SSE_R$ by forcing men and women to have the same $\beta_1$, $\beta_2$. Men only: $y_{tm} = \delta_1 + \delta_2 X_{tm} + e_{tm}$, which gives $SSE_m$. Women only: $y_{tw} = \gamma_1 + \gamma_2 X_{tw} + e_{tw}$, which gives $SSE_w$. The separate regressions allow men and women to be different. $F = \dfrac{(SSE_R - SSE_U)/J}{SSE_U/(T-K)}$, where $SSE_U = SSE_m + SSE_w$, $J$ = number of restrictions = 2, and $K$ = number of unrestricted coefficients = 4.
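The F statistic from the three-regression version of the test is simple arithmetic once the sums of squared errors are known. The sketch below assumes hypothetical SSE values and a hypothetical sample of T = 24 workers; only the formula comes from the slide.

```python
def chow_f(sse_r, sse_u, j, t, k):
    """F = ((SSE_R - SSE_U) / J) / (SSE_U / (T - K))."""
    return ((sse_r - sse_u) / j) / (sse_u / (t - k))


# Hypothetical sums of squared errors from three regressions on T = 24 workers.
sse_everyone = 100.0             # restricted: one intercept and slope for everyone
sse_men, sse_women = 45.0, 35.0  # separate regressions for each group
sse_unrestricted = sse_men + sse_women

# J = 2 restrictions, K = 4 unrestricted coefficients.
f_stat = chow_f(sse_everyone, sse_unrestricted, j=2, t=24, k=4)
print(f_stat)
```

The statistic would then be compared with a critical value from the F distribution with J and T - K degrees of freedom.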
• 18. Polynomial Terms. Polynomial regression: $y_t = \beta_1 + \beta_2 X_t + \beta_3 X_t^2 + \beta_4 X_t^3 + e_t$, with $y_t$ = income and $X_t$ = age. Linear in parameters but nonlinear in variables. People retire at different ages or not at all. [Figure: income, $y_t$, against age, $X_t$, with the age axis running from 20 to 90.]
• 19. Polynomial regression: $y_t = \beta_1 + \beta_2 X_t + \beta_3 X_t^2 + \beta_4 X_t^3 + e_t$, with $y_t$ = income and $X_t$ = age. Rate at which income is changing as we age: $\dfrac{\partial y_t}{\partial X_t} = \beta_2 + 2\beta_3 X_t + 3\beta_4 X_t^2$. The slope changes as $X_t$ changes.
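The slope formula for the cubic can be checked numerically. This is a sketch with hypothetical coefficients; `income` and `income_slope` are illustrative names, and the error term is left out.

```python
def income(age, beta1, beta2, beta3, beta4):
    """Systematic part of the cubic model (error term omitted)."""
    return beta1 + beta2 * age + beta3 * age ** 2 + beta4 * age ** 3


def income_slope(age, beta2, beta3, beta4):
    """Derivative of income with respect to age: beta2 + 2*beta3*X + 3*beta4*X^2."""
    return beta2 + 2 * beta3 * age + 3 * beta4 * age ** 2


# Hypothetical coefficients, chosen only for illustration.
B1, B2, B3, B4 = 5.0, 1.0, 0.05, -0.001

h = 1e-4
for age in (20, 40, 60):
    analytic = income_slope(age, B2, B3, B4)
    # Central difference approximation of the same derivative.
    numeric = (income(age + h, B1, B2, B3, B4) - income(age - h, B1, B2, B3, B4)) / (2 * h)
    print(age, analytic, numeric)
```

With a negative cubic coefficient the slope eventually turns negative, which matches the idea that income stops rising late in working life.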
• 20. Continuous Interaction. $y_t = \beta_1 + \beta_2 Z_t + \beta_3 B_t + \beta_4 Z_t B_t + e_t$; exam grade = f(sleep: $Z_t$, study time: $B_t$). Sleep and study time do not act independently. More study time will be more effective when combined with more sleep and less effective when combined with less sleep.
• 21. Continuous interaction. Your mind sorts things out while you sleep (when you have things to sort out). $y_t = \beta_1 + \beta_2 Z_t + \beta_3 B_t + \beta_4 Z_t B_t + e_t$; exam grade = f(sleep: $Z_t$, study time: $B_t$). Your studying is more effective with more sleep: $\dfrac{\partial y_t}{\partial B_t} = \beta_3 + \beta_4 Z_t$ and $\dfrac{\partial y_t}{\partial Z_t} = \beta_2 + \beta_4 B_t$.
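Differentiating the interaction model with respect to each variable shows that the payoff to one input depends on the level of the other. A small sketch, with hypothetical coefficient values and function names:

```python
def effect_of_study(sleep_hours, beta3, beta4):
    """Marginal effect of one more hour of study: beta3 + beta4 * Z_t."""
    return beta3 + beta4 * sleep_hours


def effect_of_sleep(study_hours, beta2, beta4):
    """Marginal effect of one more hour of sleep: beta2 + beta4 * B_t."""
    return beta2 + beta4 * study_hours


# Hypothetical coefficients: beta2 = 1.0, beta3 = 2.0, beta4 = 0.5.
for sleep in (4, 8):
    # With beta4 > 0, an hour of study is worth more after more sleep.
    print(sleep, effect_of_study(sleep, beta3=2.0, beta4=0.5))
```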
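• 22. $y_t = \beta_1 + \beta_2 Z_t + \beta_3 B_t + \beta_4 Z_t B_t + e_t$; exam grade = f(sleep: $Z_t$, study time: $B_t$). If $Z_t + B_t = 24$ hours, then $B_t = 24 - Z_t$, so $y_t = \beta_1 + \beta_2 Z_t + \beta_3 (24 - Z_t) + \beta_4 Z_t (24 - Z_t) + e_t = (\beta_1 + 24\beta_3) + (\beta_2 - \beta_3 + 24\beta_4) Z_t - \beta_4 Z_t^2 + e_t$. Relabeling the coefficients, $y_t = \alpha_1 + \alpha_2 Z_t + \alpha_3 Z_t^2 + e_t$, where $\alpha_2 > 0$ and $\alpha_3 < 0$. Sleep needed to maximize your exam grade: $\dfrac{\partial y_t}{\partial Z_t} = \alpha_2 + 2\alpha_3 Z_t = 0$, so $Z_t = \dfrac{-\alpha_2}{2\alpha_3}$.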
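The substitution and the first-order condition can be traced in code. In this sketch the names `alpha2` and `alpha3` stand for the linear and quadratic coefficients of the substituted model (the labels are an assumption), and the input values are hypothetical.

```python
def optimal_sleep(beta2, beta3, beta4):
    """Hours of sleep that maximize the expected grade when Z_t + B_t = 24.

    Substituting B_t = 24 - Z_t gives a quadratic in Z_t with
    linear coefficient    alpha2 = beta2 - beta3 + 24 * beta4
    quadratic coefficient alpha3 = -beta4
    and the maximum is at Z_t = -alpha2 / (2 * alpha3).
    """
    alpha2 = beta2 - beta3 + 24 * beta4
    alpha3 = -beta4
    if alpha3 >= 0:
        raise ValueError("no interior maximum: the coefficient on Z^2 must be negative")
    return -alpha2 / (2 * alpha3)


# Hypothetical coefficients, chosen only for illustration.
print(optimal_sleep(beta2=1.0, beta3=2.0, beta4=0.5))
```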
• 23. Multicollinearity. Correlation among the “independent” variables. Note: they are independent of the error term, but not of one another.
• 24. Let $y_i$ represent the $i$th person's wage rate and $X_i$ represent their months of work experience in the equation $y_i = b_1 + b_2 X_i + e_i$ (1), where $b_1$ = intercept (starting wage), $b_2$ = increase in the person's wage for each additional month of work experience, and $e_i$ = error term with mean zero and estimated variance $s^2$.
• 25. $y_i = b_1 + b_2 X_i + b_3 M_i + b_4 F_i + e_i$ (2), where $F_i = 1$ if female and $F_i = 0$ if male; $M_i = 1$ if male and $M_i = 0$ if female.
• 26. $y_i = b_1 + b_2 X_i + b_3 M_i + b_4 F_i + e_i$ (2). Unfortunately, this equation contains an underidentified set of parameters ($b_1$, $b_3$, and $b_4$) and cannot be estimated without some restriction on the coefficients.
• 27. To see this point, separate out the men's equation implied by equation (2), $y_i = b_1 + b_2 X_i + b_3 M_i + b_4 F_i + e_i$, from the women's equation. For men, $M_i = 1$ and $F_i = 0$, so equation (2) becomes: $y_i = (b_1 + b_3) + b_2 X_i + e_i$ (3).
• 28. For women, $M_i = 0$ and $F_i = 1$, so equation (2) becomes: $y_i = (b_1 + b_4) + b_2 X_i + e_i$ (4).
• 29. Unfortunately, although we get estimates of the intercepts $(b_1 + b_3)$ and $(b_1 + b_4)$, the value of $b_1$ cannot be separated from the values of $b_3$ and $b_4$. Some restriction is needed to achieve identification of $b_1$, $b_3$, and $b_4$.
• 30. One such restriction is $b_1 = 0$. We can drop the original intercept term, $b_1$, since men and women already have their own intercept terms, $b_3$ and $b_4$, respectively.
• 31. Underidentification of equation (2) can also be expressed in matrix terms. First, rewrite equation (2), putting the explanatory variables in a row vector multiplied by the corresponding column vector of their respective coefficients:

$$y_i = \begin{bmatrix} 1 & X_i & M_i & F_i \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{bmatrix} + e_i \qquad (5)$$

• 32. This only represents the $i$th observation, where $i = 1, \ldots, n$. To represent the entire set of $n$ observations at once, we need to "pull the window shade down" as follows:

$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} 1 & X_1 & M_1 & F_1 \\ 1 & X_2 & M_2 & F_2 \\ \vdots & \vdots & \vdots & \vdots \\ 1 & X_n & M_n & F_n \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{bmatrix} + \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix} \qquad (6)$$

• 33. Equation (6) presents us with an X matrix whose first column (the column of ones) is an exact linear combination of the last two columns (the M and F columns). Since $M_i$ is always zero when $F_i$ is equal to one, and $M_i$ is always one when $F_i$ is equal to zero, it always holds that $M_i + F_i = 1$. Therefore, the first column is equal to the sum of the last two columns.
• 34. Since $M_i$ is always zero when $F_i$ is equal to one, and $M_i$ is always one when $F_i$ is equal to zero, it always holds that $M_i + F_i = 1$:

$$\begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} = \begin{bmatrix} M_1 \\ M_2 \\ \vdots \\ M_n \end{bmatrix} + \begin{bmatrix} F_1 \\ F_2 \\ \vdots \\ F_n \end{bmatrix} \qquad (9)$$

• 35. Equation (6) and, therefore, equation (2) represent a case of perfect multicollinearity. This means that a restriction must be introduced that drops one of these columns out of the regression. One such restriction is $b_1 = 0$, which means dropping the original intercept out of the regression model to provide the following reduced model: $y_i = b_2 X_i + b_3 M_i + b_4 F_i + e_i$ (10). Now men and women have separate intercepts and no common intercept is necessary.
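The perfect multicollinearity can be seen directly by counting the linearly independent columns of the X matrix. The sketch below uses a hypothetical five-person sample and a small exact-arithmetic `rank` helper written for this example; it is an illustration, not part of the original slides.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination with exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        # Find a row at or below position r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


# Hypothetical sample: months of experience and gender for five people.
experience = [12, 30, 7, 45, 18]
male = [1, 0, 1, 0, 1]
female = [1 - m for m in male]

# Equation (2): intercept, X, M, F. Equation (10): X, M, F only.
x_full = [[1, x, m, f] for x, m, f in zip(experience, male, female)]
x_reduced = [[x, m, f] for x, m, f in zip(experience, male, female)]

rank_full = rank(x_full)        # four columns, but one is redundant
rank_reduced = rank(x_reduced)  # three columns, all independent
print(rank_full, rank_reduced)
```

Dropping the column of ones (the restriction $b_1 = 0$) is one way to restore full column rank; dropping either $M_i$ or $F_i$ instead would work equally well.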
• 36. $y_i = b_2 X_i + b_3 M_i + b_4 F_i + e_i$. For males, $M_i = 1$ and $F_i = 0$: $y_i = b_3 + b_2 X_i + e_i$. For females, $M_i = 0$ and $F_i = 1$: $y_i = b_4 + b_2 X_i + e_i$. Males and females have different starting salaries, $b_3 > b_4$, but their salaries increase at the same rate, $b_2$. [Figure: salary, $y_i$, against years of experience, $X_i$; parallel lines with slope $b_2$, intercept $b_3$ for males and $b_4$ for females.]
• 38. $y_i = b_1 + b_5 M_i X_i + b_6 F_i X_i + e_i$. For males, $M_i = 1$ and $F_i = 0$: $y_i = b_1 + b_5 X_i + e_i$. For females, $M_i = 0$ and $F_i = 1$: $y_i = b_1 + b_6 X_i + e_i$. Males and females have the same starting salary, $b_1$, but their salaries increase at different rates ($b_5$ vs. $b_6$). $b_5 > b_6$ means that men's salaries are increasing faster than women's salaries. [Figure: salary against years of experience; both lines start at $b_1$, with slope $b_5$ for males and $b_6$ for females.]
• 39. Chauvinist Industries Affirmative Action Plan. $y_i = b_3 M_i + b_4 F_i + b_5 M_i X_i + b_6 F_i X_i + e_i$. For males, $M_i = 1$ and $F_i = 0$: $y_i = b_3 + b_5 X_i + e_i$. For females, $M_i = 0$ and $F_i = 1$: $y_i = b_4 + b_6 X_i + e_i$. Females start with a higher starting salary, $b_4$, while men get the lower starting salary, $b_3$. But men get a faster rate of increase in their salaries, $b_5$, which is higher than the rate of increase for females, $b_6$ ($b_5 > b_6$). [Figure: salary against years of experience; the male line starts lower and rises more steeply than the female line.]
• 40. Back to our basic model: $y_i = b_2 X_i + b_3 M_i + b_4 F_i + e_i$. For males, $y_i = b_3 + b_2 X_i + e_i$; for females, $y_i = b_4 + b_2 X_i + e_i$. Males and females have different starting salaries, $b_3 > b_4$, but their salaries increase at the same rate, $b_2$.
• 41. Under our null hypothesis, the raw score test statistic $b_3 - b_4$ has a mean (zero) and a variance, $\text{Var}(b_3 - b_4)$, so we can standardize by subtracting the mean (zero) and dividing by the standard deviation (square root of the variance) to get the standardized test statistic: $\dfrac{b_3 - b_4}{\sqrt{\text{Var}(b_3 - b_4)}}$.
• 42. To test the null hypothesis: $Z = \dfrac{(b_3 - b_4) - 0}{\sqrt{\text{Var}(b_3 - b_4)}} \sim N(0, 1)$.
• 43. If the variance of the $y_i$, $\sigma^2$, is unknown, then $\text{Var}(b_3 - b_4)$ is also unknown and must be estimated from the expression: $\text{Est. Var}(b_3 - b_4) = \text{Est. Var}(b_3) + \text{Est. Var}(b_4) - 2\,\text{Est. Cov}(b_3, b_4)$.
• 44. Use the sample variance as an estimator of the population variance.
• 45. The values for the following expression are obtained in practice from the diagonal and off-diagonal elements of the estimated variance-covariance matrix: $\text{Est. Var}(b_3 - b_4) = \text{Est. Var}(b_3) + \text{Est. Var}(b_4) - 2\,\text{Est. Cov}(b_3, b_4)$.
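Reading the variances from the diagonal and the covariance from the off-diagonal of the estimated variance-covariance matrix can be sketched as follows. The estimates and the matrix are hypothetical, and `diff_statistic` is an illustrative name.

```python
import math


def diff_statistic(b, cov, i, j):
    """Standardized difference (b[i] - b[j]) / sqrt(Est. Var(b[i] - b[j])).

    Est. Var(b[i] - b[j]) = cov[i][i] + cov[j][j] - 2 * cov[i][j]
    """
    var_diff = cov[i][i] + cov[j][j] - 2.0 * cov[i][j]
    return (b[i] - b[j]) / math.sqrt(var_diff)


# Hypothetical estimates for (b2, b3, b4) and their variance-covariance matrix.
estimates = [0.8, 5.0, 3.0]
cov_matrix = [
    [0.04, 0.00, 0.00],
    [0.00, 1.00, 0.25],
    [0.00, 0.25, 0.50],
]

# Positions 1 and 2 hold b3 (male intercept) and b4 (female intercept).
stat = diff_statistic(estimates, cov_matrix, 1, 2)
print(stat)
```

Ignoring the covariance term would overstate the variance here, since the two intercept estimates are positively correlated in this made-up matrix.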
• 46. Alternative: make women the default group. $\hat{y}_i = b_1 + b_2 X_i + b_3 M_i$. Male: $\hat{y}_i = (b_1 + b_3) + b_2 X_i$. Female: $\hat{y}_i = b_1 + b_2 X_i$. Males and females have different starting salaries, $b_3 > 0$, but their salaries increase at the same rate, $b_2$. [Figure: salary against years of experience; parallel lines with slope $b_2$, intercept $b_1 + b_3$ for males and $b_1$ for females.]
• 47. Characteristic dummy variables: $\hat{y}_i = b_1 + b_2 X_i + b_3 M_i + b_4 D_i$. Male college grad: $\hat{y}_i = (b_1 + b_3 + b_4) + b_2 X_i$. Female college grad: $\hat{y}_i = (b_1 + b_4) + b_2 X_i$. Male not a grad: $\hat{y}_i = (b_1 + b_3) + b_2 X_i$. Female not a grad: $\hat{y}_i = b_1 + b_2 X_i$.
• 48. Very restrictive assumption: $\hat{y}_i = b_1 + b_2 X_i + b_3 M_i + b_4 D_i$. Very rigid! [Figure: wage rate, $y_i$, against years of experience, $X_i$; four parallel lines with intercepts $b_1$ for F-N (female, no degree), $b_1 + b_3$ for M-N (male, no degree), $b_1 + b_4$ for F-D (female, degree), and $b_1 + b_3 + b_4$ for M-D (male, degree).]
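• 49. Creating Composite Dummy Variables (vs. characteristic dummy variables)
• 50. Karnaugh map for gender vs. status of job (S = supervisor, I = individual; M = men, F = women):

| Gender | S | I | Total |
|---|---|---|---|
| M | 15 | 25 | 40 |
| F | 13 | 27 | 40 |
| Total | 28 | 52 | 80 |

• 51. Occupation vs. Job vs. Gender (C = Computer, T = Other Technical, U = Untechnical):

| Gender | C, S | C, I | T, S | T, I | U, S | U, I | Total |
|---|---|---|---|---|---|---|---|
| M | 2 | 4 | 3 | 5 | 10 | 16 | 40 |
| F | 1 | 6 | 0 | 7 | 12 | 14 | 40 |
| Total | 3 | 10 | 3 | 12 | 22 | 30 | 80 |

• 52. Karnaugh map for Occupation, Job Status, Gender, and Degree Status (D = Degree, N = No Degree):

| Degree | Gender | C, S | C, I | T, S | T, I | U, S | U, I | Total |
|---|---|---|---|---|---|---|---|---|
| D | M | 1 | 3 | 2 | 5 | 6 | 13 | 30 |
| D | F | 0 | 3 | 0 | 6 | 7 | 8 | 24 |
| N | M | 1 | 1 | 1 | 0 | 4 | 3 | 10 |
| N | F | 1 | 3 | 0 | 1 | 5 | 6 | 16 |
| Total | | 3 | 10 | 3 | 12 | 22 | 30 | 80 |

• 53. Composite dummy variables: this defines combined (instead of separate) general characteristics. $\hat{y}_i = b_1 + b_2 X_i + b_3 MN_i + b_4 FD_i + b_5 MD_i$. [Figure: wage rate, $y_i$, against years of experience, $X_i$; four parallel lines with intercepts $b_1$ for F-N (female, no degree), $b_1 + b_3$ for M-N (male, no degree), $b_1 + b_4$ for F-D (female, degree), and $b_1 + b_5$ for M-D (male, degree).]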
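The difference between characteristic and composite dummies shows up in the intercept for a male with a degree. A sketch under hypothetical coefficient values; the helper names are illustrative, and female with no degree is taken as the base group, as in the composite equation.

```python
def composite_dummies(male, degree):
    """Return (MN, FD, MD) for one person; female with no degree is the base group."""
    mn = int(male == 1 and degree == 0)
    fd = int(male == 0 and degree == 1)
    md = int(male == 1 and degree == 1)
    return mn, fd, md


def intercept_characteristic(male, degree, b1, b3, b4):
    """Intercept under separate gender and degree dummies: b1 + b3*M + b4*D."""
    return b1 + b3 * male + b4 * degree


def intercept_composite(male, degree, b1, b3, b4, b5):
    """Intercept under composite dummies: b1 + b3*MN + b4*FD + b5*MD."""
    mn, fd, md = composite_dummies(male, degree)
    return b1 + b3 * mn + b4 * fd + b5 * md


# Hypothetical coefficients: the composite model lets the male-degree shift (b5 = 7.0)
# differ from the sum of the separate shifts (b3 + b4 = 5.0).
print(intercept_characteristic(1, 1, b1=10.0, b3=2.0, b4=3.0))
print(intercept_composite(1, 1, b1=10.0, b3=2.0, b4=3.0, b5=7.0))
```

The characteristic model forces the male-degree intercept to equal $b_1 + b_3 + b_4$; the composite model estimates it freely as $b_1 + b_5$.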
• 54. Multiple Regression Analysis: value of residential property (buying a home).
• 55. $A_i$ = bathrooms; $X_i$ = sq. ft. living space. $\hat{Y}_i = b_1 + b_2 X_i + b_3 A_i + b_4 A_i X_i$. Two $H_0$ vs. $H_1$ tests, one on the coefficient of $A_i$ and one on the coefficient of $A_i X_i$: $b_3 / \sqrt{\text{Est. Var}(b_3)} \sim t_{n-4}$ and $b_4 / \sqrt{\text{Est. Var}(b_4)} \sim t_{n-4}$.
• 56. Testing $H_0$ (the coefficients on $A_i$ and $A_i X_i$ are both zero) vs. $H_1$: otherwise, with $SSE_R = \sum_{i=1}^{n} (y_i - b_1 - b_2 X_i)^2$ and $SSE_U = \sum_{i=1}^{n} (y_i - b_1 - b_2 X_i - b_3 A_i - b_4 A_i X_i)^2$.
• 57. Sale of House with Bed and Bath Dummies. PRICE = f(SQFEET, D2BED, D3BED, A2BATH), where SQFEET = square feet of living space, D2BED = dummy = 1 if two-bedroom house, D3BED = dummy = 1 if three-bedroom house, and A2BATH = dummy = 1 if two-bathroom house.

| SQFEET | D2BED | D3BED | A2BATH | PRICE (thousands) |
|---|---|---|---|---|
| 800 | 0 | 0 | 0 | 10.000 |
| 1000 | 0 | 0 | 1 | 20.000 |
| 1200 | 1 | 0 | 0 | 30.000 |
| 1500 | 1 | 0 | 0 | 40.000 |
| 1800 | 1 | 0 | 1 | 50.000 |
| 2000 | 1 | 0 | 1 | 60.000 |
| 2200 | 0 | 1 | 0 | 70.000 |
| 2500 | 0 | 1 | 0 | 80.000 |
| 3000 | 0 | 1 | 1 | 90.000 |
| 3500 | 0 | 1 | 1 | 100.000 |

• 58. Sale of House with Bed and Bath Dummies: PRICE = f(SQFEET, D2BED, D3BED, A2BATH). DEP VAR: PRICE; N: 10; MULTIPLE R: 0.996; SQUARED MULTIPLE R: 0.993; ADJUSTED SQUARED MULTIPLE R: 0.987; STD ERROR OF ESTIMATE: 3.40.

ANALYSIS OF VARIANCE

| SOURCE | SUM-OF-SQUARES | DF | MEAN-SQ | F-RATIO | P |
|---|---|---|---|---|---|
| REGRESSION | 8191.943 | 4 | 2047.986 | 176.378 | 0.000 |
| RESIDUAL | 58.057 | 5 | 11.611 | | |

DURBIN-WATSON D STATISTIC: 2.216; FIRST ORDER AUTOCORRELATION COEFF: -0.153.

• 59. Sale of House with Bed and Bath Dummies: PRICE = f(SQFEET, D2BED, D3BED, A2BATH). DEP VAR: PRICE; N: 10; MULTIPLE R: 0.996; SQUARED MULTIPLE R: 0.993; ADJUSTED SQUARED MULTIPLE R: 0.987; STD ERROR OF ESTIMATE: 3.40.

| VARIABLE | COEFF | STD ERR | T | P (2-TAIL) |
|---|---|---|---|---|
| INTERCEPT | -6.482 | 4.112 | -1.576 | 0.176 |
| SQFEET | 0.021 | 0.005 | 3.958 | 0.011 |
| D2BED | 14.662 | 4.871 | 3.010 | 0.030 |
| D3BED | 29.803 | 10.575 | 2.818 | 0.037 |
| A2BATH | 4.883 | 3.953 | 1.235 | 0.272 |

(For 1,000 square feet: 21 - 6.482 = 14.518, or \$14,518.)
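The regression on the ten houses can be reproduced with ordinary least squares via the normal equations. This is a minimal sketch in plain Python: the data rows are read from the slide's listing (square feet, two-bedroom dummy, three-bedroom dummy, two-bathroom dummy, price in thousands), and the `solve` helper is written for this example rather than taken from the original course material.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(m[i][c]))
        m[c], m[p] = m[p], m[c]
        for i in range(c + 1, n):
            f = m[i][c] / m[c][c]
            m[i] = [u - f * v for u, v in zip(m[i], m[c])]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


# (SQFEET, D2BED, D3BED, A2BATH, PRICE in thousands), as read from the data listing.
houses = [
    (800, 0, 0, 0, 10.0), (1000, 0, 0, 1, 20.0), (1200, 1, 0, 0, 30.0),
    (1500, 1, 0, 0, 40.0), (1800, 1, 0, 1, 50.0), (2000, 1, 0, 1, 60.0),
    (2200, 0, 1, 0, 70.0), (2500, 0, 1, 0, 80.0), (3000, 0, 1, 1, 90.0),
    (3500, 0, 1, 1, 100.0),
]

X = [[1.0, s, d2, d3, a2] for s, d2, d3, a2, _ in houses]  # column of ones first
y = [price for *_, price in houses]

# Normal equations: (X'X) b = X'y.
k = len(X[0])
xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
coef = solve(xtx, xty)  # order: INTERCEPT, SQFEET, D2BED, D3BED, A2BATH

fitted = [sum(c * v for c, v in zip(coef, r)) for r in X]
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
y_bar = sum(y) / len(y)
sst = sum((yi - y_bar) ** 2 for yi in y)
r_squared = 1.0 - sse / sst

print([round(c, 3) for c in coef], round(sse, 3), round(r_squared, 3))
```

The residual sum of squares and squared multiple R can be compared against the analysis-of-variance output on the slides.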
• 60. Regression Analysis of Sale of Residential Property. Using the estimated coefficients (INTERCEPT -6.482, SQFEET 0.021, D2BED 14.662, D3BED 29.803, A2BATH 4.883), the base value for 1,000 square feet is 21 - 6.482 = 14.518, or \$14,518.

| Addition to the base | Calculation | Value |
|---|---|---|
| A2BATH coefficient | 14,518 + 4,883 | \$19,401 |
| D2BED coefficient | 14,518 + 14,662 | \$29,180 |
| D3BED coefficient | 14,518 + 29,803 | \$44,321 |
| Add bath and 2 bedrooms | 14,518 + 4,883 + 29,803 | \$49,204 |
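The dollar figures above follow from plugging dummy values into the fitted equation. A short sketch using the reported coefficients; the function name `predicted_price` and its bedroom/bathroom arguments are assumptions made for the example.

```python
# Coefficients as reported in the regression output (PRICE is in thousands of dollars).
COEF = {"INTERCEPT": -6.482, "SQFEET": 0.021, "D2BED": 14.662, "D3BED": 29.803, "A2BATH": 4.883}


def predicted_price(sqfeet, bedrooms, bathrooms):
    """Predicted PRICE in thousands for a house with 1-3 bedrooms and 1-2 bathrooms."""
    price = COEF["INTERCEPT"] + COEF["SQFEET"] * sqfeet
    if bedrooms == 2:
        price += COEF["D2BED"]
    elif bedrooms == 3:
        price += COEF["D3BED"]
    if bathrooms == 2:
        price += COEF["A2BATH"]
    return price


for beds, baths in [(1, 1), (1, 2), (2, 1), (3, 1), (3, 2)]:
    # Convert thousands to whole dollars for display.
    print(beds, baths, round(predicted_price(1000, beds, baths) * 1000))
```

Going from the one-bedroom base to a three-bedroom house adds two bedrooms, which is why the last line of the slide uses the D3BED coefficient.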
• 61. Sales Value of Residential Property. $y$ = sales value of the property (dollars); $X$ = square feet of living space; $D1$, $D2$, $D3$ = dummy variables for one-, two-, and three-bedroom homes; $A1$, $A2$ = dummy variables for one- and two-bathroom homes. Model: $\hat{y}_i = b_1 + b_2 X_i + b_3 D2_i + b_4 D3_i + b_5 A2_i$. For a one-bedroom, one-bathroom home, $D2 = 0$, $D3 = 0$, and $A2 = 0$, so $\hat{y}_i = b_1 + b_2 X_i$.
• 62. Sales Value of Residential Property. For a 2-bedroom, 1-bathroom home, we have $D2 = 1$, $D3 = 0$, and $A2 = 0$, so $\hat{y}_i = b_1 + b_2 X_i + b_3 D2_i + b_4 D3_i + b_5 A2_i$ becomes $\hat{y}_i = (b_1 + b_3) + b_2 X_i$.
• 63. Sales Value of Residential Property. For a 1-bedroom, 2-bathroom home, we have $D2 = 0$, $D3 = 0$, and $A2 = 1$, so $\hat{y}_i = (b_1 + b_5) + b_2 X_i$.
• 64. Sales Value of Residential Property. For a 2-bedroom, 2-bathroom home, we have $D2 = 1$, $D3 = 0$, and $A2 = 1$, so $\hat{y}_i = (b_1 + b_3 + b_5) + b_2 X_i$. 3 bedroom, 2 bathroom: $\hat{y}_i = (b_1 + b_4 + b_5) + b_2 X_i$. 3 bedroom, 1 bathroom: $\hat{y}_i = (b_1 + b_4) + b_2 X_i$.
• 65. House Sales Model with Restricted Intercepts: $\hat{y}_i = b_1 + b_2 X_i + b_3 D2_i + b_4 D3_i + b_5 A2_i$. Rigid! [Figure: selling price, $y_i$, against square feet of living space, $X_i$; six parallel lines with intercepts $b_1$ for D1-A1 (one bed, one bath), $b_1 + b_5$ for D1-A2 (one bed, two bath), $b_1 + b_3$ for D2-A1 (two bed, one bath), $b_1 + b_3 + b_5$ for D2-A2 (two bed, two bath), $b_1 + b_4$ for D3-A1 (three bed, one bath), and $b_1 + b_4 + b_5$ for D3-A2 (three bed, two bath).]
• 66. Creating Composite Dummy Variables (vs. characteristic dummy variables)
• 67. How do we create composite dummy variables? We need to account for the interaction effect between bathrooms and bedrooms.

| Bathrooms | 1 bedroom | 2 bedrooms | 3 bedrooms | Total |
|---|---|---|---|---|
| 1 | 6 | 8 | 26 | 40 |
| 2 | 7 | 7 | 26 | 40 |
| Total | 13 | 15 | 52 | 80 |

• 68. Composite dummy variables are created for each nonempty cell. Create six composite dummy variables:
D1A1 = 1 if one bed and one bath, or D1A1 = 0;
D1A2 = 1 if one bed and two bath, or D1A2 = 0;
D2A1 = 1 if two bed and one bath, or D2A1 = 0;
D2A2 = 1 if two bed and two bath, or D2A2 = 0;
D3A1 = 1 if three bed and one bath, or D3A1 = 0;
D3A2 = 1 if three bed and two bath, or D3A2 = 0.
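Building one composite dummy per nonempty cell can be sketched as follows. The cell counts are those of the bathrooms-by-bedrooms table on slide 67; the function names are hypothetical.

```python
# Cell counts keyed by (bedrooms, bathrooms), taken from the bathrooms-by-bedrooms table.
counts = {(1, 1): 6, (2, 1): 8, (3, 1): 26, (1, 2): 7, (2, 2): 7, (3, 2): 26}


def composite_names(cell_counts):
    """One dummy name such as 'D2A1' for every cell that contains at least one house."""
    return [f"D{bed}A{bath}" for (bed, bath), n in sorted(cell_counts.items()) if n > 0]


def composite_row(bedrooms, bathrooms, names):
    """Dummy values for one house: 1 for its own cell, 0 for every other cell."""
    target = f"D{bedrooms}A{bathrooms}"
    return {name: int(name == target) for name in names}


names = composite_names(counts)
print(names)
print(composite_row(2, 1, names))
```

When the model keeps a common intercept, one of the six dummies (D1A1 on the slides) is left out of the regression and serves as the base group.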
• 69. Sales Value of Residential Property. $y$ = sales value of the property (dollars); $X$ = square feet of living space; D1A1 = interaction one-bed & one-bath; D1A2 = interaction one-bed & two-bath; D2A1 = interaction two-bed & one-bath; D2A2 = interaction two-bed & two-bath; D3A1 = interaction three-bed & one-bath; D3A2 = interaction three-bed & two-bath. Model: $\hat{y}_i = b_1 + b_2 X_i + b_3 D1A2_i + b_4 D2A1_i + b_5 D2A2_i + b_6 D3A1_i + b_7 D3A2_i$.
• 70. This one equation with all these dummy variables actually represents six equations. You must substitute in for each of the dummy variables to generate the six equations that are implied by this one dummy variable equation: $\hat{y}_i = b_1 + b_2 X_i + b_3 D1A2_i + b_4 D2A1_i + b_5 D2A2_i + b_6 D3A1_i + b_7 D3A2_i$. For a one-bedroom, one-bathroom home, D1A1 = 1 while the others are zero: $\hat{y}_i = b_1 + b_2 X_i$.
• 71. [Figure: House Sales Model with Unrestricted Intercepts; selling price, $y_i$, against square feet of living space, $X_i$, showing the D1-A1 (one bed, one bath) line with intercept $b_1$.]
• 72. One-bedroom, two-bathroom: D1A2 = 1 while the others are zero, so $\hat{y}_i = (b_1 + b_3) + b_2 X_i$. Now graph it!
• 73. [Figure: House Sales Model with Unrestricted Intercepts, adding the D1-A2 (one bed, two bath) line with intercept $b_1 + b_3$.]
• 74. Two-bedroom, one-bathroom: D2A1 = 1 while the others are zero, so $\hat{y}_i = (b_1 + b_4) + b_2 X_i$. Now graph it!
• 75. [Figure: House Sales Model with Unrestricted Intercepts, adding the D2-A1 (two bed, one bath) line with intercept $b_1 + b_4$.]
• 76. Two-bedroom, two-bathroom: D2A2 = 1 while the others are zero, so $\hat{y}_i = (b_1 + b_5) + b_2 X_i$. Now graph it!
• 77. [Figure: House Sales Model with Unrestricted Intercepts, adding the D2-A2 (two bed, two bath) line with intercept $b_1 + b_5$.]
• 78. [Figure: House Sales Model with Unrestricted Intercepts; selling price, $y_i$, against square feet of living space, $X_i$; six parallel lines with intercepts $b_1$ for D1-A1 (one bed, one bath), $b_1 + b_3$ for D1-A2 (one bed, two bath), $b_1 + b_4$ for D2-A1 (two bed, one bath), $b_1 + b_5$ for D2-A2 (two bed, two bath), $b_1 + b_6$ for D3-A1 (three bed, one bath), and $b_1 + b_7$ for D3-A2 (three bed, two bath).]
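• 81. Bedrooms vs. Baths vs. Garage:

| Cars in garage | 1 bed, 1 bath | 1 bed, 2 bath | 2 bed, 1 bath | 2 bed, 2 bath | 3 bed, 1 bath | 3 bed, 2 bath | Total |
|---|---|---|---|---|---|---|---|
| 1 | 2 | 4 | 3 | 5 | 10 | 16 | 40 |
| 2 | 1 | 6 | 0 | 7 | 12 | 14 | 40 |
| Total | 3 | 10 | 3 | 12 | 22 | 30 | 80 |

• 82. Karnaugh map for Bedrooms, Baths, Garage, and School (A = Adams, J = Saint Joseph):

| School | Cars in garage | 1 bed, 1 bath | 1 bed, 2 bath | 2 bed, 1 bath | 2 bed, 2 bath | 3 bed, 1 bath | 3 bed, 2 bath | Total |
|---|---|---|---|---|---|---|---|---|
| A | 1 | 1 | 3 | 2 | 5 | 6 | 13 | 30 |
| A | 2 | 0 | 3 | 0 | 6 | 7 | 8 | 24 |
| J | 1 | 1 | 1 | 1 | 0 | 4 | 3 | 10 |
| J | 2 | 1 | 3 | 0 | 1 | 5 | 6 | 16 |
| Total | | 3 | 10 | 3 | 12 | 22 | 30 | 80 |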
Description
Lots of neat examples of how to use and interpret dummy variables in regression analysis. Created by Professor Marsh for his introductory statistics course at the University of Notre Dame, Notre Dame, Indiana.
Text
• 1. Dummy Variable Models
• 2. “ Using Dummy Variables in Wage Discrimination Cases” Multiple RegressionSandy:pages 603 - 613 Alsoreadpapertitled:
• 3. Are Male Nurses Discriminated Against? malenurses0 femalenurses Years of experience, X i W f _  4 ^ W m _  3 ^ ~ m W  3 ~ W f ~  4 ~   ~ adjustedforexperiencenotadjustedforexperienceo o o o o o o o o o o o + + + + + + + + + + + + + + + + + + + + + + + + + o o o   ~
• 5. Intercept Dummy Variables Dummy variables are binary (0,1) D t= 1 ifredcar, D t= 0 otherwise. y t = 1 + 2 X t+ 3 D t+e t y t =speed of car in miles per hour X t =age of car in years Police:redcars travel faster . H 0 :  3= 0 H 1 :  3> 0
• 6. y t = 1 + 2 X t+ 3 D t+e t redcars :y t =(  1+ 3 ) + 2 X t+e t other cars :y t = 1+ 2 X t+e t y t X t miles perhour age in years 0  1+ 3  1  2  2 redcars other cars
• 7. Slope Dummy Variables y t = 1 + 2 X t+ 3 D t X t+e t y t = 1+ (  2+ 3 )X t+e t y t = 1+ 2 X t+e t y t X t value of porfolio years 0  2+ 3  1  2 stocks bonds Stock portfolio: D t= 1Bond portfolio: D t= 0  1= initial investment
• 8. Different Intercepts & Slopes y t = 1 + 2 X t+ 3 D t + 4 D t X t+e t y t =(  1+ 3 ) + (  2+ 4 )X t+e t y t = 1+ 2 X t+e t y t X t harvest weight ofcorn rainfall  2+ 4  1  2 “ miracle” regular “ miracle” seed: D t= 1regular seed: D t= 0 1+ 3
• 9. y t= 1+ 2X t+ 3 D t + e t  2  1 + 3  2  1 y t X t Men Women 0 y t = 1+ 2X t + e t Formen  D t = 1. Forwomen  D t= 0. years of experience y t = (  1 + 3 ) + 2X t + e t wage rate . . Testing for discrimination in starting wage H 0 :  3 =0 H 1 :  3 >0
• 10. y t= 1+ 5 X t+ 6 D tX t+ e t  5  5+  6  1 y t X t Men Women 0 y t = 1 + (  5+  6 )X t + e t y t= 1 + 5X t + e t For menD t= 1. For womenD t= 0. Men and women have the samestartingwage, 1, buttheirwage rates increase at differentrates(diff.= 6 ).  6 >  means thatmen’s wage rates are increasingfasterthanwomen's wage rates. years of experience wage rate
• 11. y t= 1+ 2X t+ 3D t+ 4D tX t+ e t  1+ 3  1  2  2+ 4 y t X t Men Women 0 y t= (  1+ 3 ) + (  2+ 4 ) X t+ e t y t= 1+ 2X t+ e t Women are given a higher starting wage, 1 ,while men get the lower starting wage, 1+ 3 , (  3 <0 ).But, men get a faster rate of increase in their wages, 2+ 4 , which is higher than the rate of increase for women, 2, (since 4 >0). yearsofexperience AnIneffectiveAffirmativeActionPlan women are started at a higher wage. Note : (  3 <0) wage rate
• 12. Testing Qualitative Effects
• 1.Test for differences inintercept .
• 2.Test for differences inslope .
• Test for differences in both
• interceptandslope .
• 13. H 0 :   vs  1 :   H 0 :   vs  1 :   Y t   1   2 X t   3 D t   4 D t X t b   3 Est . Var b  3 ˜ t n  4 b    4 Est . Var b  4 ˜ t n  4 men:D t= 1 ;women:D t = 0 Testing for discrimination in starting wage. Testing for discrimination in wage increases. intercept slope  e t
• 14. Why NOW wants one-sided test and Chauvinist Industries wants two-sided.
• 15. Are TwoRegressionsEqual? y t= 1+ 2X t+ 3 D t+ 4 D tX t+ e t variations of “The Chow Test”I.Assuming equal variances (pooling): men:D t= 1 ;women:D t = 0H o : 3= 4= 0vs.H 1 : otherwise y t= wage rate This model assumes equal wage rate variance. X t= years of experience
• 16. Testing  H o :       H 1 :otherwiseand SSE R   y t  b 1  b 2 X t  2 t  1 T  SSE U   y t  b 1  b  X t  b  D t  b  D t X t  2 t  1 T   SSE R  SSE U   2 SSE U   T  4   F T  4  intercept and slope
• 17. y t= 1+ 2X t+ e t II.Allowing for unequal variances: y tm= 1+ 2X tm+ e tm y tw= 1+ 2X tw+ e tw Everyone: Men only: Women only: SSE R Forcing men and women to have same 1 , 2 . Allowing men and women to be different. SSE m SSE w whereSSE U=SSE m+ SSE w F = (SSE R SSE U )/J SSE U/(T  K) J = # restrictions K=unrestricted coefs.(running three regressions) J = 2K = 4
• 18. Polynomial Terms y t= 1+ 2X t+ 3 X 2 t + 4X 3 t + e t Linear in parameters but nonlinear in variables: y t= income;X t= age Polynomial Regression y t X t People retire at different ages or not at all. 90 20 30 40 50 60 80 70
• 19. y t= 1+ 2X t+ 3 X 2 t + 4X 3 t + e t y t= income;X t= age Polynomial Regression Rate income is changing as we age : Slope changes asX tchanges.  y t  X t = 2+ 2 3 X t + 3 4X 2 t
• 20. Continuous Interaction y t= 1+ 2 Z t + 3B t+ 4 Z tB t + e t Exam grade = f(sleep: Z t , study time: B t ) Sleep and study time do not act independently. More study timewill be more effective when combined withmore sleepand less effective when combined withless sleep .
• 21. Your mind sorts things out while you sleep (when you have things to sort out.) y t= 1+ 2 Z t + 3B t+ 4 Z t B t + e t Exam grade = f(sleep: Z t , study time: B t ) Your studying ismore effective with more sleep . continuous interaction y t  B t = 2+ 4Z t  y t  Z t = 2+ 4B t
• 22. y t= 1+ 2 Z t + 3B t+ 4 Z t B t + e t Exam grade = f(sleep: Z t , study time: B t ) IfZ t+B t= 24 hours,thenB t= (24 Z t ) y t= 1 + 2 Z t +  3 (24 Z t ) +  4 Z t(24 Z t ) + e t y t= (  1 + 24  3 ) + (  2   3 + 24 4 ) Z t   4 Z 2 t + e t y t= 1+ 2 Z t + 3Z 2 t + e t Sleep needed to maximize your exam grade : where 2> 0and  3< 0  y t  Z t = 2+ 2  3Z t = 0  2  3 Z t =
• 23. Multicollinearity Correlation among the “ independent” variables. Note: They are independent of the error term, and not of one another.
• 24. Letyirepresenttheith person's wage rate andXirepresent their months of work experience in the equation: yi = b1 + b2 Xi + ei(1) b1 = intercept (starting wage) b2 = increase in the person'swage for each additional monthof work experience.ei = error term with mean zeroand estimated variances2.
• 25. yi=b1 + b2 Xi + b3Mi+ b4Fi+ ei(2) Fi= 1if female Fi = 0 ifmale . Mi= 1ifmaleMi= 0if female .
• 26. yi=b1 + b2 Xi + b3 Mi + b4Fi+ ei(2) Unfortunately this equation contains an underidentified set of parameters (b1, b3, and b4) and cannot be estimated without somerestrictionon the coefficients.
• 27. To see this point, separate out the men'sequation implied by equation (2)from thewomen'sequation. For themen'sequationMi=1 andFi=0.Formen , equation (2) becomes: yi=(b1 + b3) + b2 Xi + ei(3) yi=b1 + b2 Xi + b3Mi+ b4Fi+ ei(2)
• 28. Forwomen ,Mi=0 andFi=1. Forwomen , equation (2) becomes: yi=(b1 + b4) + b2 Xi + ei(4)
• 29. Unfortunately, although we get estimatesof the intercepts (b1 + b3) and (b1 + b4),the value of b1cannot be separatedfrom the values of b3 and b4. Somerestrictionis needed to achieveidentification of b1, b3 and b4.
• 30. One such restriction is b1 = 0. We can drop the original intercept term, b1, sincemenandwomenalreadyhave their own intercept terms,b3andb4 , respectively.
• 31. Underidentification of equation (2) can also be expressed in matrix terms.First, rewrite equation (2) putting the explanatory variables in a row vector multiplied by the corresponding column vector of their respective coefficients: y i    1  X i  M i  F i     2  3  4    i   5  1
• 32. This only represents the ithobservation where i = 1, ..., n. To represent the entire set of n observations at once, we need to&quot;pull the window shade down&quot; as follows: y 1 y 2 M y n  1 X 1 M 1 F 1 1 X 2 M 2 F 2 M M M M 1 X n M n F n  1  2  3  4   1  2 M  n (6)
• 33. Equation (6) presents us with an X matrixwhose first column (the column of ones)is an exact linear combination of the lasttwo columns (the M and F columns).Since Mi is always zero when Fi is equalto one and Mi is always one when Fi isequal to zero, then it always holdsthat Mi + Fi = 1. Therefore, the first column is equal to thesum of the last two columns.
• 34. Since Mi is always zero when Fi is equalto one and Mi is always one when Fi isequal to zero, then it always holdsthat Mi + Fi = 1.1 1 M 1  M 1 M 2 M M n  F 1 F 2 M F n ( 9 )
• 35. Equation (6) and, therefore,equation (2),represent a case of perfectmulticollinearity . This means that a restriction must beintroduced that drops one of these columnsout of the regression. One such restriction isb1 = 0 ,which means dropping the original intercept out of the regression model toprovide the following reduced model: yi=b2 Xi+b3Mi+b4Fi +ei(10) Nowmen and women have separate intercepts and no common intercept is necessary.
• 36. yi = b2 Xi + b3Mi+ b4Fi+ ei b2 b3 b2 b4 yi Xi Male Female 0 yi=b3+b2 Xi+ ei yi=b4+b2 Xi+ ei FormalesMi= 1andFi= 0. ForfemalesMi= 0andFi= 1. Malesandfemaleshavedifferent startingsalaries ,b3 > b4 , buttheir salariesincreaseatthesamerate, b2.
• 37. y i= b2 X i+ b3M i+ b4F i+ e i b2 b3 b2 b4 y i X i Male Female 0 y i =b3+b2 X i + e i y i =b4+b2 X i + e i FormalesMi= 1andFi= 0. ForfemalesMi= 0andFi= 1. Malesandfemaleshavedifferent startingsalaries ,b3 > b4 , buttheir salariesincreaseatthesamerate, b2. years of experience
• 38. y i= b1 + b5M iX i+ b6F iX i+ e i b6 b5 b1 y i X i Male Female 0 y i =b1 +b5 X i + e i y i =b1 +b6 X i + e i For malesMi = 1andFi = 0. For femalesMi = 0andFi = 1. Males and Females have the samestartingsalaryb1, buttheirsalaries increase at differentrates(b5vs.b6). b5 > b6 means thatmen salariesare increasingfasterthanwomen's salaries. years of experience
• 39. y i=b3 M i+ b4 F i+ b5 M iX i+ b6 F iX i+ e i b3 b4 For malesMi = 1andFi = 0. For femalesMi = 0andFi = 1. b6 b5 y i X i Male Female 0 y i =b3+b5 X i + e i y i =b4+b6 X i + e i Females start with a higher starting salary,b4 ,while men get the lower starting salary,b3 . But, men get a faster rate of increase in their salaries,b5 , which is higher than the rate of increase for females,b6 .(b5>b6). yearsofexperience Chauvinist Industries Affirmative Action Plan
• 40. y i= b2 X i+ b3M i+ b4F i+ e i b2 b3 b2 b4 y i X i Male Female 0 y i =b3+b2 X i + e i y i =b4+b2 X i + e i FormalesMi= 1andFi= 0. ForfemalesMi= 0andFi= 1. Malesandfemaleshavedifferent startingsalaries ,b3 > b4 , buttheir salariesincreaseatthesamerate, b2. Back to our basic model: years of experience
• 41. Since under our null hypothesisthe raw score test statistic:has ameanand avariance ,we can standardizeby subtracting the mean (zero)and dividing by the standard deviation(square root of the variance)to get the standardized test statistic:   b 3 – b 4 Var ( b 3 – b 4 ) b 3 – b 4
• 42. To test the null hypothesis: Z  ( b   b   )  0 Var ( b    b   ) ~  ( 0 , 1 )
• 43. If the var iance of the y i ,  2 , is unknown , then Var ( b  3  b  4 ) is also unknown and must be estimated from the exp ression : Est . Var ( b  3  b  4 )  Est . Var ( b  3 )  Est . Var ( b  4 )  2 Est . Cov ( b  3 , b  4 )
• 44. Use thesample varianceas an estimator of thepopulation variance :
• 45. The values for the following expression are obtained in practice from the diagonal and off-diagonal elements of the estimated variance-covariance matrix: $\text{Est.Var}(b_3 - b_4) = \text{Est.Var}(b_3) + \text{Est.Var}(b_4) - 2\,\text{Est.Cov}(b_3, b_4)$
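As a minimal sketch of slides 41 to 45, the snippet below reads the two variances and the covariance off an estimated variance-covariance matrix and forms the standardized statistic. The coefficient values and the matrix `V` are hypothetical numbers chosen for illustration, not estimates from the slides.

```python
import numpy as np

# Hypothetical estimates, ordered (b2, b3, b4); not taken from the slides.
b = np.array([0.5, 12.0, 10.0])

# Hypothetical estimated variance-covariance matrix of (b2, b3, b4).
V = np.array([
    [0.01, 0.0, 0.0],
    [0.0, 0.5, 0.1],
    [0.0, 0.1, 0.4],
])

# Est.Var(b3 - b4) = Est.Var(b3) + Est.Var(b4) - 2 Est.Cov(b3, b4):
# two diagonal elements and one off-diagonal element of V.
var_diff = V[1, 1] + V[2, 2] - 2.0 * V[1, 2]

# The same quantity as a quadratic form c' V c with contrast c = (0, 1, -1).
c = np.array([0.0, 1.0, -1.0])
var_diff_qf = float(c @ V @ c)

# Standardized statistic for the difference in starting salaries.
stat = (b[1] - b[2]) / np.sqrt(var_diff)
print(var_diff, stat)
```

The quadratic-form version is included only to show that the three-term expression is what a contrast vector picks out of the matrix.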
• 46. Alternative: make women the default group. $\hat{y}_i = b_1 + b_2 X_i + b_3 M_i$. Male: $\hat{y}_i = (b_1 + b_3) + b_2 X_i$. Female: $\hat{y}_i = b_1 + b_2 X_i$. Males and females have different starting salaries, $b_3 > 0$, but their salaries increase at the same rate, $b_2$. (Figure: salary against years of experience; parallel lines with intercepts $b_1 + b_3$ for Male and $b_1$ for Female.)
• 47. Characteristic dummy variables: $\hat{y}_i = b_1 + b_2 X_i + b_3 M_i + b_4 D_i$. Male college grad: $\hat{y}_i = (b_1 + b_3 + b_4) + b_2 X_i$; female college grad: $\hat{y}_i = (b_1 + b_4) + b_2 X_i$; male not a grad: $\hat{y}_i = (b_1 + b_3) + b_2 X_i$; female not a grad: $\hat{y}_i = b_1 + b_2 X_i$.
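• 48. Very restrictive assumption: $\hat{y}_i = b_1 + b_2 X_i + b_3 M_i + b_4 D_i$. Very rigid! (Figure: wage rate $y_i$ against years of experience $X_i$; four parallel lines with intercepts $b_1$ for F-N (female, no degree), $b_1 + b_3$ for M-N (male, no degree), $b_1 + b_4$ for F-D (female, degree), and $b_1 + b_3 + b_4$ for M-D (male, degree).)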
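One way to see why the characteristic-dummy model is rigid is to difference the four intercepts listed on slide 47. This worked step is a sketch added for illustration; it uses only the coefficients already defined there.

$$
\underbrace{(b_1 + b_3 + b_4) - (b_1 + b_4)}_{\text{M-D} - \text{F-D}} \;=\; b_3 \;=\; \underbrace{(b_1 + b_3) - b_1}_{\text{M-N} - \text{F-N}},
\qquad
\underbrace{(b_1 + b_3 + b_4) - (b_1 + b_3)}_{\text{M-D} - \text{M-N}} \;=\; b_4 \;=\; \underbrace{(b_1 + b_4) - b_1}_{\text{F-D} - \text{F-N}}
$$

The gender gap is forced to be the same with or without a degree, and the degree premium is forced to be the same for men and women: four intercepts are described by only three parameters.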
• 49. Creating Composite Dummy Variables (vs. characteristic dummy variables)
• 50. Karnaugh map for gender vs. status of job (S = supervisor, I = individual; M = men, F = women):

| Gender | S | I | Total |
|---|---|---|---|
| M | 15 | 25 | 40 |
| F | 13 | 27 | 40 |
| Total | 28 | 52 | 80 |
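• 51. Occupation vs. Job vs. Gender (C = Computer, T = Other Technical, U = Untechnical; S = supervisor, I = individual):

| Gender | C, S | C, I | T, S | T, I | U, S | U, I | Total |
|---|---|---|---|---|---|---|---|
| M | 2 | 4 | 3 | 5 | 10 | 16 | 40 |
| F | 1 | 6 | 0 | 7 | 12 | 14 | 40 |
| Total | 3 | 10 | 3 | 12 | 22 | 30 | 80 |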
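• 52. Karnaugh map for Occupation, Job Status, Gender, and Degree Status (D = Degree, N = No Degree):

| Degree | Gender | C, S | C, I | T, S | T, I | U, S | U, I | Total |
|---|---|---|---|---|---|---|---|---|
| D | M | 1 | 3 | 2 | 5 | 6 | 13 | 30 |
| D | F | 0 | 3 | 0 | 6 | 7 | 8 | 24 |
| N | M | 1 | 1 | 1 | 0 | 4 | 3 | 10 |
| N | F | 1 | 3 | 0 | 1 | 5 | 6 | 16 |
| Total | | 3 | 10 | 3 | 12 | 22 | 30 | 80 |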
• 53. Composite dummy variables: these define combined (instead of separate) general characteristics. $\hat{y}_i = b_1 + b_2 X_i + b_3 MN_i + b_4 FD_i + b_5 MD_i$. (Figure: wage rate $y_i$ against years of experience $X_i$; four parallel lines with intercepts $b_1$ for F-N (female, no degree), $b_1 + b_3$ for M-N (male, no degree), $b_1 + b_4$ for F-D (female, degree), and $b_1 + b_5$ for M-D (male, degree).)
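The composite coding on slide 53 can be sketched in a few lines. The function names `composite_dummies` and `intercept` and the numeric coefficients in the usage lines are hypothetical, chosen only to illustrate that each group gets its own freely estimated intercept, with female/no degree as the base group.

```python
def composite_dummies(male: bool, degree: bool) -> dict:
    """Return the composite dummies MN, FD, MD for one worker.

    Female with no degree (F-N) is the base group, so all three are 0 for her.
    """
    return {
        "MN": int(male and not degree),
        "FD": int((not male) and degree),
        "MD": int(male and degree),
    }


def intercept(b1, b3, b4, b5, male, degree):
    """Group intercept implied by y = b1 + b2*X + b3*MN + b4*FD + b5*MD."""
    d = composite_dummies(male, degree)
    return b1 + b3 * d["MN"] + b4 * d["FD"] + b5 * d["MD"]


# Hypothetical coefficients: the M-D intercept is b1 + b5, not b1 + b3 + b4.
print(intercept(10, 2, 3, 7, male=True, degree=True))    # 17
print(intercept(10, 2, 3, 7, male=False, degree=False))  # 10
```

Because exactly one composite dummy is switched on per worker, the male-degree intercept is no longer tied to the sum of a gender effect and a degree effect.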
• 54. Multiple Regression Analysis: value of residential property (buying a home)
• 55. $A_i$ = bathrooms, $X_i$ = sq. ft. of living space. $\hat{Y}_i = b_1 + b_2 X_i + b_3 A_i + b_4 A_i X_i$. Test $H_0: \beta_3 = 0$ vs. $H_1: \beta_3 \neq 0$ with $\dfrac{b_3}{\sqrt{\text{Est.Var}(b_3)}} \sim t_{n-4}$, and $H_0: \beta_4 = 0$ vs. $H_1: \beta_4 \neq 0$ with $\dfrac{b_4}{\sqrt{\text{Est.Var}(b_4)}} \sim t_{n-4}$.
• 56. Testing $H_0: \beta_3 = \beta_4 = 0$ vs. $H_1$: otherwise, with $SSE_R = \sum_{i=1}^{n} (y_i - b_1 - b_2 X_i)^2$ and $SSE_U = \sum_{i=1}^{n} (y_i - b_1 - b_2 X_i - b_3 A_i - b_4 A_i X_i)^2$
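The two sums of squares on slide 56 are usually combined into an F statistic. The slide does not spell the formula out, so the sketch below assumes the standard form $F = \dfrac{(SSE_R - SSE_U)/J}{SSE_U/(n - K)}$, where $J$ is the number of restrictions and $K$ the number of coefficients in the unrestricted model. The function name and the numbers in the usage line are hypothetical.

```python
def f_statistic(sse_r, sse_u, num_restrictions, n, k_unrestricted):
    """F statistic for a joint test, assuming the standard restricted vs. unrestricted form."""
    numerator = (sse_r - sse_u) / num_restrictions
    denominator = sse_u / (n - k_unrestricted)
    return numerator / denominator


# Hypothetical sums of squares: J = 2 restrictions (b3 = b4 = 0),
# n = 24 observations, K = 4 coefficients in the unrestricted model.
f_value = f_statistic(sse_r=120.0, sse_u=80.0, num_restrictions=2, n=24, k_unrestricted=4)
print(f_value)  # 5.0
```

The value would then be compared with a critical value from the F distribution with $J$ and $n - K$ degrees of freedom.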
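• 57. Sale of House with Bed and Bath Dummies: PRICE = f(SQFEET, D2BED, D3BED, A2BATH), where SQFEET = square feet of living space, D2BED = dummy = 1 if two-bedroom house, D3BED = dummy = 1 if three-bedroom house, and A2BATH = dummy = 1 if two-bathroom house.

| SQFEET | D2BED | D3BED | A2BATH | PRICE (thousands) |
|---|---|---|---|---|
| 800 | 0 | 0 | 0 | 10.000 |
| 1000 | 0 | 0 | 1 | 20.000 |
| 1200 | 1 | 0 | 0 | 30.000 |
| 1500 | 1 | 0 | 0 | 40.000 |
| 1800 | 1 | 0 | 1 | 50.000 |
| 2000 | 1 | 0 | 1 | 60.000 |
| 2200 | 0 | 1 | 0 | 70.000 |
| 2500 | 0 | 1 | 0 | 80.000 |
| 3000 | 0 | 1 | 1 | 90.000 |
| 3500 | 0 | 1 | 1 | 100.000 |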
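The ten observations on slide 57 are enough to refit the model. The sketch below uses ordinary least squares via NumPy rather than whatever package produced the slides' output, and it assumes the data rows are read off the slide as SQFEET, D2BED, D3BED, A2BATH, PRICE.

```python
import numpy as np

# Columns: SQFEET, D2BED, D3BED, A2BATH, PRICE (thousands), as read from slide 57.
data = np.array([
    [800, 0, 0, 0, 10.0],
    [1000, 0, 0, 1, 20.0],
    [1200, 1, 0, 0, 30.0],
    [1500, 1, 0, 0, 40.0],
    [1800, 1, 0, 1, 50.0],
    [2000, 1, 0, 1, 60.0],
    [2200, 0, 1, 0, 70.0],
    [2500, 0, 1, 0, 80.0],
    [3000, 0, 1, 1, 90.0],
    [3500, 0, 1, 1, 100.0],
])

# Design matrix: intercept column followed by the four regressors.
X = np.column_stack([np.ones(len(data)), data[:, :4]])
y = data[:, 4]

# Least-squares fit; coef is ordered INTERCEPT, SQFEET, D2BED, D3BED, A2BATH.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

resid = y - X @ coef
sse = float(resid @ resid)                 # residual sum of squares
sst = float(((y - y.mean()) ** 2).sum())   # total sum of squares
r2 = 1.0 - sse / sst

print(np.round(coef, 3), round(sse, 3), round(r2, 3))
```

With the data as transcribed, the fit should agree with the coefficient and ANOVA output reported on the following slides up to rounding.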
• 58. Sale of House with Bed and Bath Dummies: PRICE = f(SQFEET, D2BED, D3BED, A2BATH). DEP VAR: PRICE; N: 10; MULTIPLE R: 0.996; SQUARED MULTIPLE R: 0.993; ADJUSTED SQUARED MULTIPLE R: 0.987; STD ERROR OF ESTIMATE: 3.40.

ANALYSIS OF VARIANCE

| SOURCE | SUM-OF-SQUARES | DF | MEAN-SQ | F-RATIO | P |
|---|---|---|---|---|---|
| REGRESSION | 8191.943 | 4 | 2047.986 | 176.378 | 0.000 |
| RESIDUAL | 58.057 | 5 | 11.611 | | |

DURBIN-WATSON D STATISTIC: 2.216; FIRST ORDER AUTOCORRELATION COEFF: -0.153
• 59. Sale of House with Bed and Bath Dummies: PRICE = f(SQFEET, D2BED, D3BED, A2BATH). DEP VAR: PRICE; N: 10; MULTIPLE R: 0.996; SQUARED MULTIPLE R: 0.993; ADJUSTED SQUARED MULTIPLE R: 0.987; STD ERROR OF ESTIMATE: 3.40.

| VARIABLE | COEFF | STD ERR | T | P (2-TAIL) |
|---|---|---|---|---|
| INTERCEPT | -6.482 | 4.112 | -1.576 | 0.176 |
| SQFEET | 0.021 | 0.005 | 3.958 | 0.011 |
| D2BED | 14.662 | 4.871 | 3.010 | 0.030 |
| D3BED | 29.803 | 10.575 | 2.818 | 0.037 |
| A2BATH | 4.883 | 3.953 | 1.235 | 0.272 |

(For 1,000 square feet: 21 - 6.482 = 14.518, or \$14,518.)
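The coefficient table can be turned into predicted prices by switching the dummies on and off. This is a minimal sketch using the rounded coefficients reported above; the function name `predict_price` is hypothetical, and prices are in thousands of dollars.

```python
# Rounded coefficients from the regression output (PRICE in thousands of dollars).
coef = {
    "INTERCEPT": -6.482,
    "SQFEET": 0.021,
    "D2BED": 14.662,
    "D3BED": 29.803,
    "A2BATH": 4.883,
}


def predict_price(sqfeet, bedrooms, baths):
    """Predicted price in thousands; one bedroom and one bath are the base categories."""
    price = coef["INTERCEPT"] + coef["SQFEET"] * sqfeet
    if bedrooms == 2:
        price += coef["D2BED"]
    elif bedrooms == 3:
        price += coef["D3BED"]
    if baths == 2:
        price += coef["A2BATH"]
    return price


for beds, baths in [(1, 1), (1, 2), (2, 1), (3, 1), (3, 2)]:
    print(beds, baths, round(predict_price(1000, beds, baths), 3))
```

For a 1,000 square foot house this gives 14.518 for the base case and 49.204 for three bedrooms with two baths.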
• 60. Using the estimated coefficients (INTERCEPT -6.482, SQFEET 0.021, D2BED 14.662, D3BED 29.803, A2BATH 4.883; PRICE in thousands of dollars), the base value for 1,000 square feet with all dummies zero is 21 - 6.482 = 14.518, or \$14,518.
• Add a bath (A2BATH): \$14,518 + 4,883 = \$19,401
• Two-bedroom house (D2BED): \$14,518 + 14,662 = \$29,180
• Three-bedroom house (D3BED): \$14,518 + 29,803 = \$44,321
• Add a bath and two bedrooms (A2BATH and D3BED): 14,518 + 4,883 + 29,803 = \$49,204

Regression Analysis of Sale of Residential Property
• 61. Sales Value of Residential Property. $y$ = sales value of the property (dollars); $X$ = square feet of living space; $D1$ = dummy variable for one-bedroom home; $D2$ = dummy variable for two-bedroom home; $D3$ = dummy variable for three-bedroom home; $A1$ = dummy variable for one-bathroom home; $A2$ = dummy variable for two-bathroom home. The model is $\hat{y}_i = b_1 + b_2 X_i + b_3 D2_i + b_4 D3_i + b_5 A2_i$. For a one-bedroom, one-bathroom home, such that $D2 = 0$, $D3 = 0$, and $A2 = 0$, we have: $\hat{y}_i = b_1 + b_2 X_i$ (1 bedroom, 1 bathroom).
• 62. Sales Value of Residential Property. For a 2-bedroom, 1-bathroom home, we have $D2 = 1$, $D3 = 0$, and $A2 = 0$. From $\hat{y}_i = b_1 + b_2 X_i + b_3 D2_i + b_4 D3_i + b_5 A2_i$: $\hat{y}_i = (b_1 + b_3) + b_2 X_i$ (2 bedroom, 1 bathroom).
• 63. Sales Value of Residential Property. For a 1-bedroom, 2-bathroom home, we have $D2 = 0$, $D3 = 0$, and $A2 = 1$. From $\hat{y}_i = b_1 + b_2 X_i + b_3 D2_i + b_4 D3_i + b_5 A2_i$: $\hat{y}_i = (b_1 + b_5) + b_2 X_i$ (1 bedroom, 2 bathroom).
• 64. Sales Value of Residential Property. For a 2-bedroom, 2-bathroom home, we have $D2 = 1$, $D3 = 0$, and $A2 = 1$. From $\hat{y}_i = b_1 + b_2 X_i + b_3 D2_i + b_4 D3_i + b_5 A2_i$: $\hat{y}_i = (b_1 + b_3 + b_5) + b_2 X_i$ (2 bedroom, 2 bathroom). Likewise, $\hat{y}_i = (b_1 + b_4 + b_5) + b_2 X_i$ (3 bedroom, 2 bathroom) and $\hat{y}_i = (b_1 + b_4) + b_2 X_i$ (3 bedroom, 1 bathroom).
• 65. House Sales Model with Restricted Intercepts: $\hat{y}_i = b_1 + b_2 X_i + b_3 D2_i + b_4 D3_i + b_5 A2_i$. Rigid! (Figure: selling price $y_i$ against square feet of living space $X_i$; six parallel lines with intercepts $b_1$ for D1-A1 (one bed, one bath), $b_1 + b_5$ for D1-A2 (one bed, two bath), $b_1 + b_3$ for D2-A1 (two bed, one bath), $b_1 + b_3 + b_5$ for D2-A2 (two bed, two bath), $b_1 + b_4$ for D3-A1 (three bed, one bath), and $b_1 + b_4 + b_5$ for D3-A2 (three bed, two bath).)
• 66. Creating Composite Dummy Variables (vs. characteristic dummy variables)
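• 67. How do we create composite dummy variables? We need to account for the interaction effect between bathrooms and bedrooms.

| Bathrooms | 1 bedroom | 2 bedrooms | 3 bedrooms | Total |
|---|---|---|---|---|
| 1 | 6 | 8 | 26 | 40 |
| 2 | 7 | 7 | 26 | 40 |
| Total | 13 | 15 | 52 | 80 |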
• 68. Composite dummy variables are created for each nonempty cell. Create six composite dummy variables: D1A1 = 1 if one bed and one bath, otherwise D1A1 = 0; D1A2 = 1 if one bed and two bath, otherwise D1A2 = 0; D2A1 = 1 if two bed and one bath, otherwise D2A1 = 0; D2A2 = 1 if two bed and two bath, otherwise D2A2 = 0; D3A1 = 1 if three bed and one bath, otherwise D3A1 = 0; D3A2 = 1 if three bed and two bath, otherwise D3A2 = 0.
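The six composite dummies can be built directly from bedroom and bathroom counts. The sketch below uses a small hypothetical sample of seven houses (not the 80 houses in the table) and drops D1A1 as the base category, on the assumption that the regression includes an intercept.

```python
import numpy as np

# Hypothetical sample: bedroom and bathroom counts for seven houses.
bedrooms = np.array([1, 1, 2, 2, 3, 3, 3])
baths = np.array([1, 2, 1, 2, 1, 2, 2])

# One 0/1 dummy per (bedrooms, baths) cell, named D{bed}A{bath} as on the slide.
cells = [(d, a) for d in (1, 2, 3) for a in (1, 2)]
dummies = {
    f"D{d}A{a}": ((bedrooms == d) & (baths == a)).astype(int)
    for d, a in cells
}

# Every house falls in exactly one cell, so the six dummies sum to 1 per row.
total = sum(dummies.values())

# With an intercept in the model, drop the base cell D1A1 to avoid
# perfect collinearity; the remaining five columns enter the regression.
X_dummies = np.column_stack(
    [dummies[name] for name in dummies if name != "D1A1"]
)
print(total, X_dummies.shape)
```

Since the six dummies always sum to one, keeping all six alongside an intercept would make the design matrix rank-deficient, which is why only five appear in the regression equation.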
• 69. Sales Value of Residential Property. $y$ = sales value of the property (dollars); $X$ = square feet of living space; D1A1 = interaction one-bed & one-bath; D1A2 = interaction one-bed & two-bath; D2A1 = interaction two-bed & one-bath; D2A2 = interaction two-bed & two-bath; D3A1 = interaction three-bed & one-bath; D3A2 = interaction three-bed & two-bath. $\hat{y}_i = b_1 + b_2 X_i + b_3 \text{D1A2}_i + b_4 \text{D2A1}_i + b_5 \text{D2A2}_i + b_6 \text{D3A1}_i + b_7 \text{D3A2}_i$
• 70. This one equation with all these dummy variables actually represents six equations. You must substitute in for each of the dummy variables to generate the six equations that are implied by this one dummy variable equation: $\hat{y}_i = b_1 + b_2 X_i + b_3 \text{D1A2}_i + b_4 \text{D2A1}_i + b_5 \text{D2A2}_i + b_6 \text{D3A1}_i + b_7 \text{D3A2}_i$. For a one-bedroom, one-bathroom home, since D1A1 = 1 while the others are zero: $\hat{y}_i = b_1 + b_2 X_i$ (1 bedroom, 1 bathroom).
• 71. House Sales Model with Unrestricted Intercepts. (Figure: selling price $y_i$ against square feet of living space $X_i$; six lines labeled D1-A1 (one bed, one bath), D1-A2 (one bed, two bath), D2-A1 (two bed, one bath), D2-A2 (two bed, two bath), D3-A1 (three bed, one bath), and D3-A2 (three bed, two bath); so far only the D1-A1 intercept, $b_1$, is marked.)
• 72. One-bedroom, two-bathroom: D1A2 = 1, while the others are zero. From $\hat{y}_i = b_1 + b_2 X_i + b_3 \text{D1A2}_i + b_4 \text{D2A1}_i + b_5 \text{D2A2}_i + b_6 \text{D3A1}_i + b_7 \text{D3A2}_i$: $\hat{y}_i = (b_1 + b_3) + b_2 X_i$ (1 bedroom, 2 bathroom). Now graph it!
• 73. House Sales Model with Unrestricted Intercepts. (Figure: selling price $y_i$ against square feet of living space $X_i$; intercepts marked so far: $b_1$ for D1-A1 (one bed, one bath) and $b_1 + b_3$ for D1-A2 (one bed, two bath).)
• 74. Two-bedroom, one-bathroom: D2A1 = 1, while the others are zero. From $\hat{y}_i = b_1 + b_2 X_i + b_3 \text{D1A2}_i + b_4 \text{D2A1}_i + b_5 \text{D2A2}_i + b_6 \text{D3A1}_i + b_7 \text{D3A2}_i$: $\hat{y}_i = (b_1 + b_4) + b_2 X_i$ (2 bedroom, 1 bathroom). Now graph it!
• 75. House Sales Model with Unrestricted Intercepts. (Figure: selling price $y_i$ against square feet of living space $X_i$; intercepts marked so far: $b_1$ for D1-A1 (one bed, one bath), $b_1 + b_3$ for D1-A2 (one bed, two bath), and $b_1 + b_4$ for D2-A1 (two bed, one bath).)
• 76. Two-bedroom, two-bathroom: D2A2 = 1, while the others are zero. From $\hat{y}_i = b_1 + b_2 X_i + b_3 \text{D1A2}_i + b_4 \text{D2A1}_i + b_5 \text{D2A2}_i + b_6 \text{D3A1}_i + b_7 \text{D3A2}_i$: $\hat{y}_i = (b_1 + b_5) + b_2 X_i$ (2 bedroom, 2 bathroom). Now graph it!
• 77. House Sales Model with Unrestricted Intercepts. (Figure: selling price $y_i$ against square feet of living space $X_i$; intercepts marked so far: $b_1$ for D1-A1 (one bed, one bath), $b_1 + b_3$ for D1-A2 (one bed, two bath), $b_1 + b_4$ for D2-A1 (two bed, one bath), and $b_1 + b_5$ for D2-A2 (two bed, two bath).)
• 78. House Sales Model with Unrestricted Intercepts. (Figure: selling price $y_i$ against square feet of living space $X_i$; all six intercepts marked: $b_1$ for D1-A1 (one bed, one bath), $b_1 + b_3$ for D1-A2 (one bed, two bath), $b_1 + b_4$ for D2-A1 (two bed, one bath), $b_1 + b_5$ for D2-A2 (two bed, two bath), $b_1 + b_6$ for D3-A1 (three bed, one bath), and $b_1 + b_7$ for D3-A2 (three bed, two bath).)
• 79. Creating Composite Dummy Variables (vs. characteristic dummy variables)
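• 80. How do we create composite dummy variables? We need to account for the interaction effect between bathrooms and bedrooms.

| Bathrooms | 1 bedroom | 2 bedrooms | 3 bedrooms | Total |
|---|---|---|---|---|
| 1 | 6 | 8 | 26 | 40 |
| 2 | 7 | 7 | 26 | 40 |
| Total | 13 | 15 | 52 | 80 |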
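• 81. Bedrooms vs. Baths vs. Garage (columns: number of bedrooms, then number of baths; rows: cars in garage):

| Cars in garage | 1 bed, 1 bath | 1 bed, 2 bath | 2 bed, 1 bath | 2 bed, 2 bath | 3 bed, 1 bath | 3 bed, 2 bath | Total |
|---|---|---|---|---|---|---|---|
| 1 | 2 | 4 | 3 | 5 | 10 | 16 | 40 |
| 2 | 1 | 6 | 0 | 7 | 12 | 14 | 40 |
| Total | 3 | 10 | 3 | 12 | 22 | 30 | 80 |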
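• 82. Karnaugh map for Bedrooms, Baths, Garage, and School (A = Adams, J = Saint Joseph):

| School | Cars in garage | 1 bed, 1 bath | 1 bed, 2 bath | 2 bed, 1 bath | 2 bed, 2 bath | 3 bed, 1 bath | 3 bed, 2 bath | Total |
|---|---|---|---|---|---|---|---|---|
| A | 1 | 1 | 3 | 2 | 5 | 6 | 13 | 30 |
| A | 2 | 0 | 3 | 0 | 6 | 7 | 8 | 24 |
| J | 1 | 1 | 1 | 1 | 0 | 4 | 3 | 10 |
| J | 2 | 1 | 3 | 0 | 1 | 5 | 6 | 16 |
| Total | | 3 | 10 | 3 | 12 | 22 | 30 | 80 |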