Variational distance of product of distributions
Let $F(\bar x)=\prod_{i=1}^{n}f(x_i)$ and $G(\bar x)=\prod_{i=1}^{n}g(x_i)$, where $f(x)$ and $g(x)$ are probability density functions and $\bar x=(x_1,\ldots,x_n)$. The variational distance between $F$ and $G$ is
$$V(F,G)=\int \left|F(\bar x)-G(\bar x)\right|\,d\bar x.$$
Can we write it in terms of the variational distance between $f$ and $g$? I know this is possible for the KL divergence:
$$D_{KL}(F,G)=n\,D_{KL}(f,g).$$
Is there a comparable simplification for the variational distance?
Tags: probability, probability-theory, statistics, information-theory, total-variation
asked Sep 13 '18 at 17:38 by Albert (578)
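As a concrete illustration of why no identity like the KL one can hold, take the Gaussian location pair $f=N(0,1)$ and $g=N(\Delta,1)$. The standard closed form for the $L^1$ distance between equal-variance normals gives
$$V(f,g)=4\Phi\!\left(\tfrac{\Delta}{2}\right)-2,\qquad V(F,G)=4\Phi\!\left(\tfrac{\Delta\sqrt{n}}{2}\right)-2,$$
where $\Phi$ is the standard normal CDF. Here $V(F,G)$ saturates at $2$ as $n$ grows, while $n\,V(f,g)$ grows without bound, so in particular no identity of the form $V(F,G)=n\,V(f,g)$ is possible.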
No, we don't. The TV distance simply doesn't tensorize well. A saving grace, however, is that TV is easily bounded by many $f$-divergences (e.g., Pinsker's inequality gives $V \le \sqrt{2\,D_{KL}}$), which is often sufficient to solve the problem at hand. Alternatively, one can 'metrise' the problem using some other divergence: KL, $\chi^2$, Jensen–Shannon and Hellinger are popular choices, the last two being symmetric in their arguments and 'close to' actual norms, if that is important.
– stochasticboy321, Sep 14 '18 at 2:49 (5 upvotes)
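A small numerical sketch of the comment's point, using the Gaussian pair from the illustration above (the helper name tv_l1_gaussian is purely illustrative, not a library function): the Pinsker-plus-tensorization bound $\sqrt{2n\,D_{KL}(f,g)}$ is typically much tighter than the trivial bound $n\,V(f,g)$, even though neither is an identity.

import numpy as np
from scipy.stats import norm

def tv_l1_gaussian(delta, n=1, sigma=1.0):
    # L1 distance  integral |F - G|  between the n-fold products of
    # N(0, sigma^2) and N(delta, sigma^2). The two product measures are
    # Gaussians whose means differ by delta*sqrt(n) in Euclidean norm,
    # and for equal-covariance Gaussians  integral |F - G| = 4*Phi(d/(2*sigma)) - 2.
    d = abs(delta) * np.sqrt(n)
    return 4 * norm.cdf(d / (2 * sigma)) - 2

delta, n = 0.3, 10
v1 = tv_l1_gaussian(delta)           # V(f, g)
vn = tv_l1_gaussian(delta, n)        # V(F, G)
kl1 = delta**2 / 2                   # D_KL(f, g) for equal-variance normals
print(f"V(F,G)              = {vn:.4f}")
print(f"n * V(f,g)          = {n * v1:.4f}   (trivial/telescoping bound)")
print(f"sqrt(2*n*D_KL(f,g)) = {np.sqrt(2 * n * kl1):.4f}   (Pinsker + KL tensorization)")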
1 Answer
The key property of the KL divergence is that it satisfies a "chain rule", which leads to tensorization, i.e. a one-dimensional property carrying over to product measures.
The TV distance, while convenient because it is a genuine metric (and has a nice formula on simple spaces), does not tensorize the way the divergences do. By Monge–Kantorovich duality (which is what Blackbird used in his answer, referred to above), the TV distance is a minimum, over couplings of the two distributions, of the probability that the coupled variables disagree (up to the factor of $2$ implicit in the $\int|F-G|$ normalization). Assuming an independent coupling of the coordinates certainly recovers the trivial bound derived by Blackbird. But couplings can carry much more "structure", and that structure can become very complicated as the dimension grows (you have more coordinates to couple and play with): there is more going on "between the components" than within the components themselves. This is why one should expect only trivial bounds for the TV distance in higher dimensions.
However, via MK duality one can convert the Bobkov–Götze theorem (a generalization of Pinsker) into a "transportation cost inequality", which does tensorize. In that case, bounds on the individual TV distances do yield bounds on the product TV distance. Marton's theorem is the result I know here; see chapter 4, section 4 of Van Handel's notes: https://web.math.princeton.edu/~rvan/APC550.pdf
So yes, in general it matters a great deal which distance you work with when dealing with inequalities, and the fact that TV does not tensorize underlines that.
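For completeness, the "trivial bound" alluded to above (Blackbird's answer is not reproduced on this page) is presumably the subadditivity bound $V(F,G)\le n\,V(f,g)$, which follows from a telescoping identity:
$$F(\bar x)-G(\bar x)=\sum_{k=1}^{n}\Big(\prod_{i<k}f(x_i)\Big)\big(f(x_k)-g(x_k)\big)\Big(\prod_{i>k}g(x_i)\Big),$$
so taking absolute values, integrating, and using that the densities in the remaining coordinates integrate to $1$,
$$V(F,G)\le\sum_{k=1}^{n}\int\big|f(x_k)-g(x_k)\big|\,dx_k=n\,V(f,g).$$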
answered yesterday by астон вілла олоф мэллбэрг (40.3k)