This happened at work. One of the questions posted by one of our brighter analysts was about Freedman's paradox. We see a lot of predictive modeling work across industries where a bunch of variables are thrown together to get some predictive power on some dependent variable. Sometimes variables that are not actually related to the outcome will still show predictive power (hence the paradox). This made me go see what other paradoxes are out there in statistics.
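Freedman's paradox is easy to reproduce in a few lines. The sketch below (my own illustration, not from the original discussion) regresses a purely random response on 50 purely random predictors; even though nothing is related to anything, a handful of coefficients will typically come out "significant" at the usual 5% level, simply because we tested so many of them.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 50
X = rng.standard_normal((n, p))   # predictors: pure noise
y = rng.standard_normal(n)        # response: noise, unrelated to X

# Ordinary least squares with an intercept
Xd = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
resid = y - Xd @ beta
sigma2 = resid @ resid / (n - p - 1)          # residual variance
cov = sigma2 * np.linalg.inv(Xd.T @ Xd)       # coefficient covariance
t_stats = beta[1:] / np.sqrt(np.diag(cov)[1:])

# With 50 noise predictors at the ~5% level (|t| > 2),
# we expect roughly 2-3 spurious "significant" variables
n_significant = int(np.sum(np.abs(t_stats) > 2))
print(n_significant)
```

If you then refit the model keeping only the "significant" variables, they look even more convincing, which is the trap Freedman pointed out.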
A lot of us are familiar with Simpson's paradox (I will write about this later, as it is one of my favorite topics), and Freedman's paradox is related to it in the sense that we get strong relationships where we should expect none.
A bunch of the other paradoxes are even more interesting, possibly due to my relative ignorance of them. Wikipedia has a list of these, and I will probably do some in-depth reading on them once I am at a better place.
Lindley's paradox - A very interesting paradox showing how we can reach opposite conclusions from the same data and the same hypothesis, depending on whether we evaluate it the frequentist way or the Bayesian way.
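The standard textbook illustration goes like this (the numbers below are my own illustrative choices, not from the post): with a large sample and a result just past the significance threshold, a frequentist test rejects the null hypothesis while a Bayesian analysis of the same data ends up favoring it.

```python
import math

# Test H0: theta = 0 from n observations with known sigma = 1 (illustrative setup)
n = 10_000
xbar = 0.02                        # sample mean, so z = xbar * sqrt(n) = 2.0
z = xbar * math.sqrt(n)

# Frequentist view: two-sided p-value -- rejects H0 at the 5% level
def phi(t):                        # standard normal CDF via erf
    return 0.5 * (1 + math.erf(t / math.sqrt(2)))
p_value = 2 * (1 - phi(z))         # about 0.046

# Bayesian view: P(H0) = 1/2, and under H1 assume theta ~ N(0, 1)
def normal_pdf(x, var):
    return math.exp(-x * x / (2 * var)) / math.sqrt(2 * math.pi * var)

m0 = normal_pdf(xbar, 1 / n)       # marginal likelihood under H0
m1 = normal_pdf(xbar, 1 / n + 1)   # marginal likelihood under H1
posterior_h0 = m0 / (m0 + m1)      # about 0.93 -- the data favor H0!

print(round(p_value, 4), round(posterior_h0, 2))
```

Same data, same hypothesis, opposite conclusions: the p-value says reject, the posterior says the null is overwhelmingly likely.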
False positive paradox - Suppose your infection rate is about 1%, your test is 90% accurate in detecting the infection, and you test 20% of the population. Your test will have told about 2% of the population that they have the infection when in reality they do not (false positives), while correctly identifying only about 0.18% of the population as actually infected (true positives). That means over 90% of the people you told have the infection actually do not have it!
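The arithmetic above is worth walking through once; a minimal sketch, working in fractions of the population with the rates from the paragraph:

```python
infection_rate = 0.01     # 1% of the population is infected
accuracy = 0.90           # test is right 90% of the time, both ways
tested = 0.20             # 20% of the population gets tested

infected_tested = tested * infection_rate        # 0.002 of the population
healthy_tested = tested * (1 - infection_rate)   # 0.198 of the population

true_positives = infected_tested * accuracy        # 0.0018 -> 0.18% of population
false_positives = healthy_tested * (1 - accuracy)  # 0.0198 -> ~2% of population

# What share of the people told "you are infected" are actually healthy?
share_false = false_positives / (true_positives + false_positives)
print(round(share_false, 3))   # -> 0.917
```

Because the healthy group is so much larger than the infected group, even a 10% error rate on the healthy swamps a 90% hit rate on the infected.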
Another link that I did some reading on was also good. There are a lot more paradoxes, and the mathematical and probability literature has plenty of them. They make fun reading for the analytically curious, because you need the ability to recognize these issues when you see them in your day-to-day work.