Video Transcript
Some data sets can be modelled with nonlinear regression models. The first part of the question asks us which scatterplot shows data that can be best modelled by logistic regression. Is it scatterplot A), B), C), or D)? The second part of the question asks us which formula models the relationship between 𝑥 and 𝑦 in such a data set. Is it option A) 𝑦 equals 𝑐 over one minus 𝑎𝑒 to the 𝑏𝑥? Option B) 𝑦 equals 𝑐 over one minus 𝑎𝑒 to the negative 𝑏𝑥? Option C) 𝑦 equals 𝑐 over one plus 𝑎𝑒 to the 𝑏𝑥? Or option D) 𝑦 equals 𝑐 over one plus 𝑎𝑒 to the negative 𝑏𝑥?
Let's start with the first part of the question: which scatterplot shows data that can be best modelled by logistic regression? For logistic regression to be suitable for a particular data set, that data should fall in a very distinctive shape when plotted on a scatterplot. We can see this very distinctive shape in option D. This shape is sometimes called an S-shape because the data tends to fall on or near a curve which has a somewhat slanted S-shape.
The characteristics of a logistic curve, for which logistic regression is named, are as follows. A logistic curve starts with a shallow slope, which increases before the curve plateaus again. Option D is the only option which displays these characteristics.
We can draw a nice smooth curve through or near the data points for option B. But while this curve starts off shallow and gets steeper, as we'd like it to for a logistic regression curve, it never plateaus again. And we need the slope to become shallower and the curve to plateau for logistic regression to be suitable. The data in option B, therefore, isn't particularly suitable to be modelled by logistic regression.
For options A and C, it's pretty hard to draw a nice smooth curve through or near the data points. But certainly, we don't see a shallow slope becoming steeper before plateauing again. Option D is the only one that makes sense.
Now we move on to the second part of our question: which formula models the relationship between 𝑥 and 𝑦 in such a data set? All the options have the same form. The question is whether there's a plus or a minus after the one in the denominator, and whether there is a minus sign before 𝑏𝑥 in the exponent.
Of course, if 𝑎 and 𝑏 are allowed to be negative, then all of these formulas are equivalent. For example, we can get formula A from formula B by changing the value of 𝑏 to its additive inverse, or opposite. We, therefore, assume that both 𝑎 and 𝑏 are positive, and that these options are showing us the form of the formula that we expect to see.
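The equivalence claimed here can be checked numerically. The following is a minimal sketch, not part of the original video: the function names and the sample values 𝑎 = 2, 𝑏 = 0.5, 𝑐 = 3 are arbitrary assumptions chosen for illustration.

```python
import math

# The four candidate formulas from the question.
def option_a(x, a, b, c):
    return c / (1 - a * math.exp(b * x))

def option_b(x, a, b, c):
    return c / (1 - a * math.exp(-b * x))

def option_c(x, a, b, c):
    return c / (1 + a * math.exp(b * x))

def option_d(x, a, b, c):
    return c / (1 + a * math.exp(-b * x))

# Arbitrary illustrative parameter values (an assumption, not from the video).
a, b, c = 2.0, 0.5, 3.0
# Sample points chosen to avoid the x-value where option A's denominator is zero.
xs = (-3.0, -1.0, 0.0, 2.0)

# Negating b turns formula B into formula A, and formula D into formula C.
a_from_b = all(math.isclose(option_a(x, a, b, c), option_b(x, a, -b, c)) for x in xs)
c_from_d = all(math.isclose(option_c(x, a, b, c), option_d(x, a, -b, c)) for x in xs)
# Negating a turns formula B into formula D.
d_from_b = all(math.isclose(option_d(x, a, b, c), option_b(x, -a, b, c)) for x in xs)

print(a_from_b, c_from_d, d_from_b)
```

This is why the sign restriction matters: without it, the four options would not describe different curves.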
To decide which option is correct, it helps to look at the previous part of the question and to extend the curve we've drawn in both directions. We can see that as 𝑥 approaches negative ∞, we want 𝑦 to decrease, getting closer and closer to zero. 𝑦 tends to zero as 𝑥 tends to negative ∞. Looking at the other end of the curve, we see that as 𝑥 approaches positive ∞, we want 𝑦 to asymptotically approach some finite value. So, 𝑦 tends to this finite value as 𝑥 approaches ∞.
We can use these facts to choose between our options. But first we need to clear some room, so we have space to work. Let's try option A first. Does 𝑦 equals 𝑐 over one minus 𝑎𝑒 to the 𝑏𝑥 tend to zero as 𝑥 tends to negative ∞ as we would like? Well, for this fraction to tend to zero, as the numerator 𝑐 is unchanging, the denominator will have to tend to plus or minus ∞. And does it?
No, it doesn't. As 𝑥 tends to negative ∞, 𝑒 to the 𝑏𝑥 tends to zero. This is because we said that 𝑏 is greater than zero. Similarly, 𝑎 times 𝑒 to the 𝑏𝑥 will tend to zero, as will negative 𝑎𝑒 to the 𝑏𝑥. And so, one minus 𝑎𝑒 to the 𝑏𝑥 will tend to one, not plus or minus ∞ as we require. We can, therefore, eliminate option A, as in this case, 𝑦 does not tend to zero as 𝑥 tends to negative ∞.
If we replace the minus sign in the denominator with a plus sign, we find that one plus 𝑎𝑒 to the 𝑏𝑥 also tends to one as 𝑥 tends to negative ∞. And for this reason, we can also eliminate option C, 𝑦 equals 𝑐 over one plus 𝑎𝑒 to the 𝑏𝑥, as it doesn't tend to zero as 𝑥 tends to negative ∞. Let's move on to option B.
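A quick numerical check of why options A and C fail, offered as a rough sketch: the parameter values 𝑎 = 2, 𝑏 = 0.5, 𝑐 = 3 are arbitrary positive numbers assumed for illustration, and 𝑥 = −100 simply stands in for "𝑥 tending to negative ∞".

```python
import math

# Arbitrary positive parameters (an assumption, not from the video).
a, b, c = 2.0, 0.5, 3.0

def option_a(x):
    return c / (1 - a * math.exp(b * x))

def option_c(x):
    return c / (1 + a * math.exp(b * x))

# Evaluate far to the left, where e^(bx) is vanishingly small.
x_far_left = -100.0
y_a = option_a(x_far_left)
y_c = option_c(x_far_left)

# Both denominators are close to one here, so y is close to c rather than zero.
print(y_a, y_c)
```

With these values, both outputs sit at the numerator 𝑐 rather than near zero, which matches the reasoning used to eliminate the two options.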
We ask if option B, 𝑦 equals 𝑐 over one minus 𝑎𝑒 to the negative 𝑏𝑥, tends to zero as 𝑥 tends to negative ∞. For this to happen, the denominator one minus 𝑎𝑒 to the negative 𝑏𝑥 must tend to either plus or minus ∞ as 𝑥 tends to negative ∞. In this case, the answer is yes. As 𝑥 tends to negative ∞, negative 𝑏𝑥 tends to ∞. This is because 𝑏 is greater than zero. And so, 𝑒 to the negative 𝑏𝑥 will tend to ∞, as will 𝑎𝑒 to the negative 𝑏𝑥, as 𝑎 is greater than zero.
However, negative 𝑎𝑒 to the negative 𝑏𝑥 will tend to negative ∞. And adding one to this doesn't do much to change this. One minus 𝑎𝑒 to the negative 𝑏𝑥 also tends to negative ∞ as 𝑥 tends to negative ∞. So, the denominator of our fraction tends to negative ∞ as 𝑥 tends to negative ∞. And so, our fraction itself tends to zero as 𝑥 tends to negative ∞, as we wish.
But before we get too excited, let's just stop and think. The denominator of this fraction for 𝑦 tends to negative ∞, so it's negative. And if we assume that the numerator of our fraction for 𝑦, which is 𝑐, is positive, much like 𝑎 and 𝑏, then the value of 𝑦, which is 𝑐 over something negative, will be negative. 𝑦 is negative and approaching zero from below. This isn't what we see in our diagram. Something strange is going on, and you might like to use graphing software to explore what. But it looks like option B is out of the running.
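For anyone without graphing software to hand, the strange behaviour of option B can be explored numerically. This is only a sketch under assumed values (𝑎 = 2, 𝑏 = 0.5, 𝑐 = 3, all arbitrary positive numbers). It relies on the observation that the denominator one minus 𝑎𝑒 to the negative 𝑏𝑥 is zero when 𝑥 equals ln 𝑎 over 𝑏, so the curve has a vertical asymptote there.

```python
import math

# Arbitrary positive parameters (an assumption, not from the video).
a, b, c = 2.0, 0.5, 3.0

def option_b(x):
    return c / (1 - a * math.exp(-b * x))

# The denominator 1 - a*e^(-bx) vanishes where e^(-bx) = 1/a, i.e. x = ln(a)/b.
pole = math.log(a) / b

y_far_left = option_b(-100.0)       # tiny in size, but negative
y_left_of_pole = option_b(pole - 0.1)   # negative
y_right_of_pole = option_b(pole + 0.1)  # positive

print(pole, y_far_left, y_left_of_pole, y_right_of_pole)
```

So with positive parameters, option B approaches zero from below on the left and changes sign across a vertical asymptote, which is nothing like the S-shaped scatterplot.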
In option D, the minus sign after the one in the denominator becomes a plus sign. And we ask if one plus 𝑎𝑒 to the negative 𝑏𝑥 tends to plus or minus ∞ as 𝑥 tends to negative ∞. Again, the answer is yes, but this time 𝑎𝑒 to the negative 𝑏𝑥 tends to plus ∞, and one plus 𝑎𝑒 to the negative 𝑏𝑥 does also. And so, one plus 𝑎𝑒 to the negative 𝑏𝑥 is greater than zero. And it turns out that this is not only true for suitably small 𝑥, it's true for all 𝑥.
This means that 𝑦 is positive, as we want and expect. So, we have no such problem with option D, which is good news because we've eliminated all the other options. Option D is, therefore, our answer. You might notice that we haven't checked that 𝑦 tends to some constant finite value as 𝑥 tends to ∞. We don't need to check this, as we already found our answer by a process of elimination.
But if you do check this, then you'll find that this limiting value is, in fact, 𝑐, the numerator of our fraction. This also justifies the previous assumption that 𝑐 should be positive like 𝑎 and 𝑏. We used this assumption to show that 𝑦 would be less than zero for suitably small 𝑥 in option B. This formula, 𝑦 equals 𝑐 over one plus 𝑎𝑒 to the negative 𝑏𝑥, for logistic regression is well worth remembering. As we've just seen, it's not particularly easy to work this formula out using logic alone.
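The properties of option D described above can be confirmed with a short numerical sketch. As before, the values 𝑎 = 2, 𝑏 = 0.5, 𝑐 = 3 and the name `logistic` are assumptions made for illustration only, and 𝑥 = ±100 stands in for the limits at ±∞.

```python
import math

# Arbitrary positive parameters (an assumption, not from the video).
A, B, C = 2.0, 0.5, 3.0

def logistic(x, a=A, b=B, c=C):
    """Option D: y = c / (1 + a * e^(-b x))."""
    return c / (1 + a * math.exp(-b * x))

xs = list(range(-20, 21))
ys = [logistic(x) for x in xs]

# y stays positive and keeps increasing across the sampled range.
always_positive = all(y > 0 for y in ys)
always_increasing = all(y1 < y2 for y1, y2 in zip(ys, ys[1:]))

# Far to the left y is close to zero; far to the right y is close to c.
left_tail = logistic(-100.0)
right_tail = logistic(100.0)

# At x = ln(a)/b the denominator equals two, so y is half of c.
midpoint_x = math.log(A) / B
midpoint_y = logistic(midpoint_x)

print(always_positive, always_increasing, left_tail, right_tail, midpoint_y)
```

The halfway point at 𝑥 equals ln 𝑎 over 𝑏 is where the curve is steepest, which is the middle of the slanted S-shape seen in scatterplot D.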