The Bayes predictor of the square loss is $\Bbb E_P[Y\mid X=x]$?



Let $(X,Y) \in \Bbb X \times \Bbb Y$ be jointly distributed according to a distribution $P$. Let $h: \Bbb X \rightarrow \tilde{\Bbb Y}$, where $\tilde{\Bbb Y}$ is the space of predicted outputs, and let $L(h,P) \equiv \Bbb E_P[l(Y, h(X))]$, where $l$ is some loss function.

Show that the minimizer $f = \arg\min_h L(h,P)$ is given by $f(x) = \Bbb E_P[Y \mid X = x]$ when $l$ is the square loss: $l(Y, h(X)) = (Y - h(X))^2$.

I figured I would show this by showing that any other $h$ gives an $L(h,P)$ at least as large as that of $h(x) = \Bbb E_P[Y \mid X = x]$.

I start with the inequality I want to establish:
$$\Bbb E_P\big[(Y - \Bbb E_P[Y \mid X = x])^2\big] \le \Bbb E_P\big[(Y - h(X))^2\big].$$

Expanding both sides:
$$\Bbb E_P\big[Y^2 - 2Y\,\Bbb E_P[Y \mid X = x] + \Bbb E_P[Y \mid X = x]^2\big] \le \Bbb E_P\big[Y^2 - 2Y h(X) + h(X)^2\big].$$

Simplifying:
$$-2\,\Bbb E_P[Y]\,\Bbb E_P[Y \mid X = x] + \Bbb E_P[Y \mid X = x]^2 \le -2\,\Bbb E_P[Y h(X)] + \Bbb E_P\big[h(X)^2\big].$$

But from here I'm a little stuck as to how to continue.

Does anyone have any ideas?
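
Edit: as the comments below suggest, the cleaner route seems to be to condition on $X$ first and minimize pointwise, rather than starting from the inequality I want to prove. Here is my sketch of that argument, writing $m(x) = \Bbb E_P[Y \mid X = x]$ (corrections welcome if I have a step wrong):

$$
\begin{aligned}
\Bbb E_P\big[(Y - h(X))^2 \mid X = x\big]
&= \Bbb E_P\big[(Y - m(x))^2 \mid X = x\big]
 + 2\big(m(x) - h(x)\big)\,\Bbb E_P\big[Y - m(x) \mid X = x\big]
 + \big(m(x) - h(x)\big)^2 \\
&= \Bbb E_P\big[(Y - m(x))^2 \mid X = x\big] + \big(m(x) - h(x)\big)^2,
\end{aligned}
$$

since $\Bbb E_P[Y - m(x) \mid X = x] = 0$ kills the cross term. The leftover term $\big(m(x) - h(x)\big)^2$ is nonnegative for every $x$, so taking the expectation over $X$ (tower property) gives $L(h,P) \ge L(m,P)$, with equality iff $h(X) = m(X)$ almost surely.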







Tags: probability machine-learning






asked Mar 17 at 12:33 by Oliver G; edited Mar 17 at 12:39 by Bernard







Comments:

  • Minus One-Twelfth (Mar 17 at 14:04): You want to show that the conditional expectation minimises the square loss. You can find a discussion of this here: stats.stackexchange.com/questions/71863/…

  • leonbloy (Mar 25 at 18:34): "I start with": you don't start with your desired conclusion. You start with what you know.

  • Saad (Mar 26 at 1:58): The link given by Minus One-Twelfth has effectively answered the question in detail. Is there anything else you'd like to know?
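
For anyone who wants a quick numerical sanity check of the claim discussed above, here is a small simulation sketch (illustrative only: the joint distribution and the competing predictors are made up for the example and are not part of the original question):

```python
# Sanity check (illustrative): when the conditional mean m(x) = E[Y | X = x]
# is known exactly, its empirical square loss should be no larger than that
# of any other predictor h.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical joint distribution: X ~ Uniform(0, 1) and
# Y = sin(2*pi*X) + Gaussian noise, so m(x) = sin(2*pi*x) exactly.
x = rng.uniform(0.0, 1.0, size=n)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.5, size=n)

def square_loss(pred):
    """Empirical estimate of E[(Y - h(X))^2] for predictions h(X) = pred."""
    return np.mean((y - pred) ** 2)

m = np.sin(2 * np.pi * x)                      # the conditional mean E[Y | X]
h_shifted = m + 0.3                            # a biased competitor
h_linear = np.polyval(np.polyfit(x, y, 1), x)  # best linear fit in x

print("conditional mean :", square_loss(m))          # about 0.25 (the noise variance)
print("shifted predictor:", square_loss(h_shifted))  # about 0.25 + 0.3**2
print("linear fit       :", square_loss(h_linear))   # also larger than 0.25
```

With $n$ this large the three numbers line up closely with their theoretical values, but of course this is only a plausibility check, not a proof.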











