Some Quant lecture or several

00:00 - 00:39 In the case of male-female, this is a dummy variable. You would say: if a person is female, then the value of the dependent variable decreases by 0.30 compared to the reference, which would be, let's say, male and other gender, for example. It depends on how you code it; something always needs to be a reference. But we will get into the details of this; for now it's not important.
00:39 - 01:46 Last week I also briefly showed you at the end that the ANOVA you did with Bodo is in principle OLS, so it's the same thing. You have the regression model that has a general expression: you have a function, right, in a general form, and then we use a linear function that we estimate with the OLS model, and then you get something like this. At the end you have an error term, the leftover. Okay.
01:47 - 01:58 So you have an intercept and then you have a slope. The intercept is the value of the dependent variable when all x's are equal to zero. The slope shows you the change

01:58 - 02:29 in the value of the dependent variable for a given change in x. Right. So you can calculate the predicted value of the dependent variable for any case once you estimate the model. That being said, x can be metric and can also be categorical. Categorical is solved with dummy coding.
02:29 - 02:51 Categories are not numbers by nature, so we need to make them numerical. Okay, so you can have, I don't know, different colors, and then I can classify based on that: green, blue, white, or whatever. These are categories, not numbers. But through dummy coding, categories

02:51 - 03:19 are given numerical expressions, and then they can be used in statistical analysis, right? So if you have a simple variable for whether somebody passed a bonus assignment, 1 or 0, yes or no, you can estimate a beta for a person who passed the bonus assignment, which equals 14.
03:19 - 04:14 So if you pass the bonus assignment, your predicted grade increases by 14 units compared to a person who failed the bonus assignment, who had a 0 here. This is how we interpret that. For a metric variable, you can say that one additional hour of studying increases the predicted grade by the value of the beta coefficient, which would be 20.77 points. Again, this example shows how OLS and one-way ANOVA are connected, or how they are actually one and the same. So we have three groups: control, treatment one, treatment two, okay?
04:15 - 04:25 Then we have some product liking. Let's say you want to investigate how different marketing messages influence the perception of a product. The control group doesn't receive any message.

04:25 - 04:44 Then you give message A, then you give message B, and then you ask them how much they like this product, on a scale of 1 to 100. You get some numbers here. You can dummy code this group variable.
04:51 - 05:24 You create one dummy column for each category. So if a person is in the control group, they have a 1, otherwise 0. The same for treatment 1: if a person is in treatment 1, these two people, they get 1s, otherwise 0s. If they are in treatment 2, they get 1s in the treatment 2 column, otherwise 0s. So this would be the dummy coding. One category always needs to be the reference point.
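The dummy coding just described can be sketched in a few lines of Python; the group labels and the six observations are hypothetical:

```python
# Build one 0/1 column per category (group labels are made up for illustration).
groups = ["control", "control", "treat1", "treat1", "treat2", "treat2"]

dummies = {g: [1 if obs == g else 0 for obs in groups]
           for g in ["control", "treat1", "treat2"]}

print(dummies["treat1"])  # [0, 0, 1, 1, 0, 0]
```

In the regression itself you would drop one of the three columns as the reference category.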
05:24 - 05:42 This means that you are comparing relative to that category. Which one is the reference point doesn't matter statistically, so you select it. Here, naturally, you would want the control group to be your reference point.

05:42 - 06:13 Because then, when you interpret the results, you can say: if a person was in treatment one, the effect on liking was the beta for treatment one, compared to the control. If you put treatment 1 as the reference, then the comparison would be different. You would say: if a person was in the control group, the effect on liking was the beta for control,

06:13 - 06:46 compared with a person who was in treatment 1. You'll see how that works. Normally you want something to be the reference that makes sense, that is a baseline. If you have an experimental design with a control and some treatments, then you logically prefer to compare each treatment with the control. So I wrote this here; this is the general expression of an OLS model.
06:46 - 07:17 If you want to run a model with this simple data set with only 6 observations, you can write it like this: beta 0, beta 1 times treatment 1, beta 2 times treatment 2. Is this clear? There is also an error term, but we don't need to worry about it at the moment. What beta 0 captures here is the average product liking in the control group.
07:18 - 08:51 This is your reference point. Then beta 1 and beta 2 show you the difference in average liking between treatment 1 and the control, and between treatment 2 and the control. So imagine that you have three groups, each with its own mean. You're comparing these three means, and you want to see whether they differ: whether the treatment increased the liking. This is liking, right? And what is this, the y-axis?
08:51 - 09:27 If you have a probability density function... Cumulative? Variation? No. A probability density function, the bell-shaped curve, what do we have on the y-axis? Probability density, yes, it's probability, right? So you have the percentage of your

09:27 - 10:05 sample for which you observe a value. If the value is, I don't know, 20, then the probability here is maybe, I don't know, 5%, 10% or something. The highest density is around the mean, so most people have average liking. And then, as you move further away from the mean, the probability that you observe that value is lower. So you need to know the probability density function.
10:05 - 11:06 And the cumulative density function. You need to be able to read these things on your own, also for your general knowledge, let's say. This axis is, like, sort of, the dependent variable. What about box plots? Do you understand how to read box plots? You basically just take these distributions and you turn them sideways, like this; these will be the box plots. In this case liking is here, right, and then you have the control group, group one, and group two.

11:11 - 11:46 Then if this one has the lowest liking, its box plot will be somewhere here, right? This one has the highest, so its box plot will be somewhere here. Okay, and this one is in the middle somewhere. So box plots, probability density functions, cumulative density functions: you need to understand these concepts and know how to read them.
11:46 - 12:26 The point I wanted to make here is that the intercept captures the average liking in the control group. Beta 1 would be exactly this difference between the average liking in the treatment 1 group and the average liking in the control group. Beta 2 would be this difference between the average liking in treatment 2 and the control group.
12:30 - 13:23 So you generate descriptive statistics for each group, control, treatment one, treatment two, and you can read out the average liking. In the control it's 37.5, then close to 60, then close to 75. Okay, then you run an OLS model here. The intercept is the average liking in the control group. The beta for treatment one is the difference between this one and this one, which is 22. So the conclusion, or the interpretation: if you're exposed to treatment one, your liking increases by 22 points compared to a person in the control group.
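As a sketch of this OLS-ANOVA equivalence, here is a tiny fit with NumPy. The liking scores are invented, but chosen so the group means roughly match the slide (37.5, about 60, about 75):

```python
import numpy as np

# Made-up liking scores: control mean 37.5, treatment 1 mean 59.5, treatment 2 mean 75.
y  = np.array([35.0, 40.0, 58.0, 61.0, 73.0, 77.0])
t1 = np.array([0, 0, 1, 1, 0, 0])           # treatment 1 dummy
t2 = np.array([0, 0, 0, 0, 1, 1])           # treatment 2 dummy
X  = np.column_stack([np.ones(6), t1, t2])  # control group is the reference

# Ordinary least squares: solve for intercept and the two dummy betas.
b0, b1, b2 = np.linalg.lstsq(X, y, rcond=None)[0]
# b0 = control-group mean; b1, b2 = mean differences vs. the control group.
print(round(b0, 1), round(b1, 1), round(b2, 1))  # 37.5 22.0 37.5
```

The intercept recovers the control-group mean, and each dummy coefficient recovers that group's mean difference from the control, which is exactly what one-way ANOVA compares.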
13:23 - 13:41 Who wants to interpret this one, treatment 2? If you are in treatment 2, you are more likely to increase your... You are not more likely; we are not talking about likelihood yet. But liking increases...

13:41 - 13:57 Liking increases by 33 units, compared to the control. Yeah, 37, but it doesn't matter. So you're on a continuous scale of liking; you're not in the probability world yet. That comes with logistic regression.

13:57 - 14:28 But you're basically sliding on this scale, from this mean to this mean. So this is how you can do ANOVA by using OLS. But you can also calculate it simply with ANOVA, and then you need to do this post hoc estimation, I'm not sure if you talked about that, to see where the effect is. Because what do you see from this ANOVA? Does somebody know how to interpret this?
14:32 - 14:44 This is a one-way ANOVA. Think about the lecture from Bodo: comparing means between different groups. So it's significant, right? Do you know what the null hypothesis in ANOVA is?
14:51 - 15:15 The null hypothesis. So ANOVA you use when you have more than two groups. If you have two groups, you can use a t-test; with three groups you can use ANOVA. The null hypothesis in ANOVA is that all these means are equal: the control mean equals the treatment one mean, equals the treatment two mean, equals the treatment three mean, and so on.
15:15 - 15:29 So you're testing the null hypothesis that all means are equal. That is the null hypothesis. Do you reject or accept it? Reject. Reject.

15:29 - 15:47 Reject. So if p is low, you reject the null hypothesis. Here you're rejecting the null hypothesis that all means are equal, which means that at least one mean is different from the others. So there is some effect; we just don't know where the effect lies.
15:48 - 15:59 So to see where the effect lies, if you just run this one-way ANOVA, you would need to do post hoc estimation, which does pairwise comparisons.

15:59 - 16:11 This one with this one, this one with this one, and this one with this one. Then you see which one is different from the others, so you can see which group differs.
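A minimal sketch of the pairwise comparisons a post hoc test performs. The scores are hypothetical, and a real post hoc procedure, such as Tukey's HSD, would additionally correct for multiple testing:

```python
from itertools import combinations

# Hypothetical liking scores per group (means 37.5, 59.5, and 75).
data = {"control": [35.0, 40.0], "treat1": [58.0, 61.0], "treat2": [73.0, 77.0]}

def mean(xs):
    return sum(xs) / len(xs)

# All pairwise mean differences between the three groups.
diffs = {(a, b): mean(data[b]) - mean(data[a])
         for a, b in combinations(data, 2)}

for (a, b), d in diffs.items():
    print(f"{b} - {a}: {d:+.1f}")
```

Each printed line corresponds to one of the pairwise comparisons mentioned above; a significance test per pair then tells you which groups actually differ.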
16:11 - 16:24 Is it control and group one, control and group two, or group one and group two? In the regression output you can already more or less see it: they're both significant, and treatment two is more effective

16:24 - 17:01 than treatment 1. Is this more or less clear to you now? For some of you it was already clear. Do you have any questions? Yeah: if the intercept is insignificant, how can we comment on it?
17:02 - 17:17 We usually don't interpret the intercept. Okay. Yeah. Because the intercept is an artifact of the estimation procedure: the line needs to go through something.

17:17 - 17:30 But there is uncertainty about where it goes through, and this information doesn't tell you anything, because you care not about the intercept but about the betas that are associated with the predictors.
17:31 - 17:40 And then, if you standardize (we'll talk about standardization), the intercept goes through zero, always. Because you kind of

17:40 - 17:50 rescale everything, and then it always goes through zero. So the intercept doesn't give you any information.
17:50 - 18:50 Do you have any other questions? Okay. So this is what we talk about: why do we need dummies, what are they, what are the types, and how do you interpret them? There are three types of dummies. Okay. Why do you need dummies?

18:52 - 19:12 We talked about that already. To work with non-numerical data? Yeah, to work with non-numerical, more precisely categorical, data. Okay? I have a question for you.
19:15 - 19:35 If you do a survey and you have education, a level of education that goes from, let's say, primary school to PhD, and then you have master, bachelor, maybe something in between, depending on the country.

19:35 - 19:46 What kind of data is this? It's categorical, right? But if you measure it in years, then what is it? In years in school? In total.

19:46 - 19:57 In total, in years. So, years of school: what if you study for your bachelor's for six years? You're still a bachelor, but this is...
Full text made by Jill White Voice Notes Assistant Link: https://t.me/JillWhite_voice_notes_bot?do=open_link Telegram: @JillWhite_voice_notes_bot
00:00 - 00:09 Six years. You're still a bachelor, but it's kind of more than just school, right? You could do it that way, I mean, yeah, you could do that, but it's not ideal.

00:09 - 00:18 Yeah, it can be an issue. So you usually ask people what the highest level of education they achieved is, and then you have categorical data.
00:19 - 00:30 But this data is also ordinal, right? It goes from lowest to highest. So if it's ordinal, it can also be considered quasi-metric data.
00:32 - 01:06 So you could model it as metric. You could say: a one-level increase in education does this to the outcome. Or, if bachelor's is here, with everything below here and everything above here, you could sort people into two categories. You can categorize this data and then say: people who attended

01:06 - 01:32 university, no matter what degree they got in the end, and people who did not. So you could do that as well. Or you can treat it as categorical data and then create dummies. Okay. So depending on how many categories you have, you would have

01:32 - 01:51 the number of categories minus one dummies. What do you mean by this? By what? You can create a dummy for each category minus one, because one needs to be a reference, right?

01:51 - 02:03 Ah, I see. So you always have, and that's also in the slides, underlined a bit later, k-1 dummies, because one is the reference.
02:03 - 02:27 If you, I don't know, think about gender: you say female, and then male and other. These are two categories, and one is the reference, so it's a binomial dummy, but one still needs to be a reference. If you have multiple categories, then one still

02:27 - 03:17 needs to be a reference. So this is stated here. For example, for a farm you can say it's organic or conventional, so we have two categories, right? You can say organic farm, yes or no, one or zero. Or for level of education you can say high school, bachelor, master, so three categories; it's a multinomial dummy, and one still needs to be the reference. You can decide which one. If you try to put all three dummies in the regression, then it doesn't estimate, so you get an error

03:17 - 03:46 because of multicollinearity issues. We will talk about multicollinearity later; it falls under the assumptions. But one needs to be the reference, and then you compare relative to that one. So this slide summarizes the fact that you always need to take one category and use it as the reference. Here is an example of dummy coding. So if you have
03:52 - 04:23 high school, bachelor, and master, you can omit high school and simply not code that one. Then you have bachelor and master: x2 corresponds to whether a person has a bachelor's degree, x3 to whether a person has a master's degree. And then you would always compare what happens with the value of the dependent variable

04:23 - 04:46 when a person holds a bachelor's degree compared to high school, and what happens with the value of the dependent variable when a person holds a master's degree compared to a high school degree. Is this idea clear?
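A minimal sketch of this k-1 coding for education, with high school as the omitted reference; the survey responses are made up:

```python
# k = 3 education categories -> k - 1 = 2 dummies; high school is the
# omitted reference category (hypothetical survey data).
education = ["high_school", "bachelor", "master", "bachelor", "high_school"]

x2 = [1 if e == "bachelor" else 0 for e in education]  # bachelor vs. high school
x3 = [1 if e == "master" else 0 for e in education]    # master vs. high school

# High-school respondents are 0 in both columns, so they form the baseline.
print(x2)  # [0, 1, 0, 1, 0]
print(x3)  # [0, 0, 1, 0, 0]
```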
04:46 - 05:19 We will practice that tomorrow. There are three types of dummy variables: intercept, slope, and slope-intercept dummies. The slope-intercept dummy is the so-called moderation analysis. This may be a new concept for you; I will explain it in a second.
05:19 - 05:59 We start with the simplest, the intercept dummy. We are going back to the Boston housing example, and we are working with a categorical variable: whether a house is in a good neighborhood or a bad neighborhood. So it's some classification of neighborhoods into good and bad, 0 and 1. So how do you specify the model? If a house is in a good neighborhood,

05:59 - 06:35 the dummy automatically takes the value of 1, and this whole expression equals beta 2. If a house is in a bad neighborhood, the dummy takes the value of 0, and it doesn't matter what beta 2 is, because beta 2 captures the effect of the house being in a good neighborhood; for a house in a bad neighborhood, this whole term disappears, as you can see here. So this is a simple regression.
06:38 - 07:34 But now we can show it with two axes. We have house size, which is metric, here, and then we have the dummy for whether the neighborhood is good. We assume that the bigger the house, the higher the house price; that's the line with slope beta 1. But beta 2 you actually add on top, because it is a constant: you can sum beta 0 plus beta 2, and that automatically gives an intercept shifter. The intercept shifts either up, as in this case, or it can also shift down. So this captures the effect of the neighborhood being good or bad.

07:37 - 08:06 So if you have a house of zero square meters, which doesn't make sense, in a bad neighborhood, this would be the price; let's say 5,000 euros. If you have a house of zero square meters in a good neighborhood, you just walk up the y-axis: your starting point is higher. And then on top of that there is the effect of size, which is the slope. Is this confusing?
08:06 - 08:51 I don't think so. Take a minute to look at the graph. We assume the intercept is constant, and the effect of this dummy is also kind of a constant. Because you can ask yourself: if a house has 50 square meters, let's say this is here, and it's in a bad neighborhood, this would be the price.

08:51 - 09:24 More or less, right? But if you have the same house in a good neighborhood, the price is higher, by the value of beta 2. This distance between the two lines is the effect of the neighborhood, and it is constant, right? And then the slope of this line is the effect of the house size.

09:27 - 09:46 So the dummy predictor is the effect of the neighborhood? Hmm? The dummy predictor is the effect of the neighborhood, yes. And it's called an intercept dummy because, as you can see here, it's beta zero plus beta two.
09:46 - 10:08 So the intercept shifts up or down depending on the value of the dummy's beta coefficient. This is how you can show this model: a so-called conceptual model, if you want to illustrate how your model looks.
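The intercept dummy can be sketched on made-up housing data; all the numbers below (base price, price per square meter, neighborhood premium) are invented for illustration:

```python
import numpy as np

# Intercept-dummy sketch: price = b0 + b1*size + b2*good (all numbers made up).
size = np.array([50.0, 75.0, 90.0, 40.0, 60.0, 80.0])
good = np.array([0.0, 1.0, 1.0, 0.0, 0.0, 1.0])  # 1 = good neighborhood
price = 5000 + 100 * size + 20000 * good          # known structure, no noise

X = np.column_stack([np.ones_like(size), size, good])
b0, b1, b2 = np.linalg.lstsq(X, price, rcond=None)[0]

# The two regression lines are parallel: same slope b1,
# intercepts b0 (bad) and b0 + b2 (good); b2 is the constant vertical gap.
print(int(round(b0)), int(round(b1)), int(round(b2)))  # 5000 100 20000
```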
10:11 - 11:00 So we have size, which is metric, and good neighborhood, which is a dummy, affecting price. Okay, this will be the slope dummy now; you write the equation differently. You still have beta 0 as the intercept, and beta 1 captures the effect of size, but beta 2 is on the interaction term between the size of the house and the neighborhood. Do you know what an interaction term is?

11:04 - 11:41 No? Yes. Yes. What can you tell us? I mean, it's when two conditions are happening at the same time, and you capture that with one term. Yeah, so you basically multiply two columns in the data set: the column that shows the size of the house with the column that shows whether a house is in a good neighborhood or a bad neighborhood. In practice,
11:41 - 13:12 it looks like this. You have size and you have neighborhood; size can be 50, 75, 90, 40, whatever. Then you have the interaction between these two, so you just multiply: 50 times 0 is 0, and the same for 75, 90, and 40. That gives you the interaction term. So what happens here is that the slope changes, okay? This will be the regression line showing the predicted house price, depending on size, for a house that is in a bad neighborhood.

13:12 - 14:07 The multiplication is zero if a house is in a bad neighborhood, so this whole term is zero. If a house is in a good neighborhood, you're adding beta 1 and beta 2 together, because they are both multiplied by size, so you can write the expression like this. The assumption is that the bigger the house, when it's in a good neighborhood, the faster the price should increase. So you're testing a somewhat different assumption with this type of model.
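Building the slope-dummy interaction column is literally the multiplication described above; the sizes and neighborhood labels are hypothetical:

```python
import numpy as np

# Slope-dummy model: price = b0 + b1*size + b2*(size*good), numbers made up.
size = np.array([50.0, 75.0, 90.0, 40.0])
good = np.array([0.0, 1.0, 1.0, 0.0])
interaction = size * good        # 50*0=0, 75*1=75, 90*1=90, 40*0=0
print(interaction.tolist())  # [0.0, 75.0, 90.0, 0.0]

# In a bad neighborhood the slope of price in size is b1;
# in a good neighborhood it is b1 + b2: only the slope shifts, not the intercept.
```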
14:10 - 14:45 So the hypothesis is formulated differently: house size has a stronger impact on the price in a good neighborhood. This is what you're testing. Previously we were testing how neighborhood and house size separately affect the house price. And this last one is the slope dummy, right, where we have only the interaction.

14:48 - 15:26 For the slope-intercept dummy, I just want to show you the graph. You're looking at the individual effects, but also at the joint effect through the interaction. So the questions you can answer are: what is the effect of size on house price? What is the effect of the neighborhood, i.e. whether the house is located in a good neighborhood? And is there an interaction between the neighborhood and house size on the price? This would be kind of this line.
15:27 - 15:37 It's the interaction. So you can isolate the effect of the neighborhood and of the size. It's really hard for me to

15:37 - 15:52 really understand. For example, a house costs, like, 10, whatever, something like that. And you can say: okay, 5 of those 10 is explained by the size, and 3 is explained because it's a good neighborhood.

15:53 - 16:05 Given the coefficients that we have. Like, imagine a coefficient. Yeah. If we have a coefficient, then given the size and the location, 5 of those 10 are explained by the size

16:05 - 16:28 and 3 of those 10 are explained by the neighborhood, so the 2 that are left are explained by the interaction, like the reinforcement between being in a good neighborhood and being a great size. You're now talking about the percentage of variance explained, right? Yeah. But we're not necessarily looking at that first.
16:28 - 16:53 So you can compare different models and see whether they fit better. You can run the slope, intercept, and slope-intercept dummy models and then see which fits better. But we're talking about the relationship between variables here. With this interaction you're assuming there is a joint impact. You cannot see if this is five percent; I mean, you could see, if you

16:53 - 17:41 run the model stepwise. Yes, exactly, if you run the model in steps (that's called hierarchical regression), then you can see how the adjusted R-squared changes and which model fits better. You can do that, but it's not the point of this lecture. Here the point is to understand slope, intercept, and slope-intercept dummies: what they mean and what they do. Right? With this interaction you're assuming that bigger houses in good neighborhoods also cost more, after controlling for the neighborhood and after controlling for the size itself.

17:41 - 18:07 So there is an interaction, something that works together in some way. I will show you a little bit later a different example of what an interaction is, or what moderation and mediation are. I just want to show you this graph now; see what's happening here, how you write the regression equation. You have the intercept, then you have the beta for size, which is the slope of the line.

18:07 - 18:39 Then you have the beta for the neighborhood, which is the intercept shifter. This shows you the effect of the neighborhood irrespective of the house size, because we are here at zero, so it doesn't matter what the house size is; it's the effect of the neighborhood. And then there's the change in slope: you see the slope here is this, and the slope here is this. This is the joint interaction effect between size and the neighborhood.
18:39 - 19:29 Okay? And then you have a regression line for a house in a bad neighborhood: this is zero and this is zero, so these two terms together are zero, and you only have the size effect. This is the regression line for a house in a good neighborhood: price is a function of size and the neighborhood. So this is 1 and this is 1, so beta 3 and beta 2 survive, in a way: the two of them together here, and here beta 1 and beta 3.

19:29 - 19:52 So this is the effect of size, and beta 3 is the additional effect of size in a good neighborhood. You can sum these two together, and then you can write the function for this line here: the regression equation, however you want to call it.
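The full slope-intercept dummy model can be sketched the same way; all coefficients in the simulated data are invented for illustration:

```python
import numpy as np

# Slope-intercept dummy: price = b0 + b1*size + b2*good + b3*(size*good).
size = np.array([50.0, 75.0, 90.0, 40.0, 60.0, 80.0])
good = np.array([0.0, 1.0, 1.0, 0.0, 0.0, 1.0])
price = 5000 + 100 * size + 15000 * good + 50 * size * good  # known structure

X = np.column_stack([np.ones_like(size), size, good, size * good])
b0, b1, b2, b3 = np.linalg.lstsq(X, price, rcond=None)[0]

# Bad neighborhood:  intercept b0,      slope b1
# Good neighborhood: intercept b0 + b2, slope b1 + b3
print(int(round(b1)), int(round(b1 + b3)))  # 100 150
```

Summing beta 1 and beta 3 gives the slope of the good-neighborhood line, exactly as described above.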
00:00 - 00:20 But let's go to this moderation and mediation, so you completely understand what that means, what you saw here.
00:21 - 00:42 This is a moderation analysis, okay? Mediation is similar, but we don't explain that here. This is moderation: you're looking at the effect of x on y, the effect of v on y, and then you're looking

00:42 - 01:57 at the interaction x times v on y. The assumption here is that the moderator is independent of x: x doesn't influence the value of the moderating variable. An example could be: is the relationship between bad cholesterol and the amount of exercise performed per week different for normal-weight and obese participants? So you're looking at the effect of weekly exercise on bad cholesterol, at the effect of weight (whether a person is obese or has a normal body mass index) on bad cholesterol, and then at the interaction: exercise for obese people, because you're multiplying. For obese people who exercise, what happens with their cholesterol?
01:57 - 02:45 And you might want to test the hypothesis that this interaction effect is positive. This would mean that when an obese person exercises, their cholesterol drops faster. Do you have another example of moderation that comes to mind, some relationship where you could have this interaction effect, where something reinforces something else? Here the assumption is that the effect of exercise on cholesterol,

02:45 - 03:00 which we assume exercise will decrease, is moderated by this weight. Maybe this is not the best example, because we cannot exclude that weekly exercise affects weight itself.

03:00 - 03:46 Right? So this path might still exist, at least in the long run, if you do such a study. It's just an example; maybe I can change it to something better. What else comes to your mind where you might have moderation? Maybe something about agricultural production, or whatever.
03:46 - 04:01 Maybe something with pesticide use, and how the amount of them, I don't know,

04:01 - 04:12 affects yield, and so on. Is everybody comfortable with pesticides?

04:12 - 04:34 So you have pesticides, in kilograms, let's say. Here we have yield; it can also be in kilograms. And then a moderating variable could be what?

04:38 - 04:52 Quality of land. Quality of pesticides. Land, he said. Land quality could be one.
04:52 - 05:24 Or fertilizer, maybe, as the independent variable, and pesticide as the moderator. That can also work. So you apply fertilizer, and then here we can have pesticides as the moderator. Here it's metric; everything is metric.

05:24 - 05:59 That's not a problem; that can also be the case. But the moderator could also be a dummy: whether the farmer applied or didn't apply a certain pesticide. So that can be a dummy variable, for example. Let's maybe stick with the dummy here. So, yes or no. What would be the hypothesis you could test here? What would be this first path?
05:59 - 06:14 So here you would assume what? Fertilizer on yield: positive or negative? Positive. So you want to test whether the amount of fertilizer applied increases yield,

06:14 - 06:31 controlling for the amount of pesticides applied, right? Then here: pesticides applied or not, on yield, positive or negative? Positive. Positive. If you apply agrochemical measures, you get more yield.

06:32 - 06:54 What about the interaction? If you use fertilizer and pesticides together, how does that affect the yield, positively or negatively? Yeah, what do you think: is this interaction positive?
06:55 - 07:46 So the question you're asking is: what is the effect on yield of the amount of fertilizer applied (fertilizer is something continuous) for farmers who did not apply pesticides, compared to farmers who did apply pesticides? And then there is also a slope shifter, but this part is the intercept shifter. So maybe you will have something similar to what is on the slide.

07:46 - 08:49 So this would be yield, and this would be pesticide... or fertilizer, sorry. If you don't apply pesticides but you do increase the amount of fertilizer applied on your field, the yield increases. If you apply pesticides, so if this is yes, the yield automatically increases by the difference between beta 1 and beta 0. This is the effect of pesticides; it's here. And then this slope change shows you the effect of fertilizer on a field where you also apply pesticides: fertilizer increases yield faster there.
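A small sketch of this prediction with hypothetical coefficients; all values are invented, and a positive b3 encodes the assumption that fertilizer raises yield faster on fields where pesticides are also applied:

```python
# Hypothetical coefficients for the fertilizer/pesticide moderation sketch:
# yield = b0 + b1*fertilizer + b2*pesticide + b3*(fertilizer*pesticide)
b0, b1, b2, b3 = 2.0, 0.05, 0.8, 0.02  # all values invented

def predicted_yield(fertilizer_kg, pesticide):
    """Predicted yield (hypothetical units) for a given fertilizer amount
    and a 0/1 pesticide dummy."""
    return b0 + b1 * fertilizer_kg + b2 * pesticide + b3 * fertilizer_kg * pesticide

# With pesticides the yield line starts higher (b2) and rises faster (b1 + b3):
print(round(predicted_yield(100, 0), 1))  # 7.0
print(round(predicted_yield(100, 1), 1))  # 9.8
```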
08:53 - 09:12 Do you have an idea of an example where this relationship would be inverse? Where you would have a negative effect, where you would have a graph that looks like this: you initially start high, with beta 0, but then you shift downwards.
09:14 - 09:20 And the slope may be less pronounced. So think about the model.