Variational distance of product of distributions


Let $F(\bar{x})=\prod_{i=1}^{n}f(x_i)$ and $G(\bar{x})=\prod_{i=1}^{n}g(x_i)$, where $f(x)$ and $g(x)$ are probability density functions, and $\bar{x}=(x_1,\ldots,x_n)$. The variational distance between $F$ and $G$ is
$$V(F,G)=\int |F(\bar{x})-G(\bar{x})|\,d\bar{x}.$$
Can we write it in terms of the variational distance between $f$ and $g$?




I know we could do this if it were the KL divergence:
$$D_{KL}(F,G)=n\,D_{KL}(f,g).$$
Is there a similar simplification for the variational distance?
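As a sanity check of the KL identity above, here is a minimal Monte Carlo sketch (an illustration only; the choices $f=\mathcal N(0,1)$, $g=\mathcal N(1,1)$ and $n=5$ are mine, not part of the question):

    import numpy as np

    # Illustration only: f = N(0, 1), g = N(1, 1), n = 5 factors.
    # Checks D_KL(F, G) = n * D_KL(f, g) by Monte Carlo; here D_KL(f, g) = 1/2.
    rng = np.random.default_rng(0)
    n, m = 5, 200_000

    def log_pdf(x, mu):
        # log density of N(mu, 1)
        return -0.5 * np.log(2 * np.pi) - 0.5 * (x - mu) ** 2

    x = rng.normal(0.0, 1.0, size=m)                      # samples from f
    kl_1d = np.mean(log_pdf(x, 0.0) - log_pdf(x, 1.0))    # estimates D_KL(f, g)

    X = rng.normal(0.0, 1.0, size=(m, n))                 # samples from F
    kl_nd = np.mean(np.sum(log_pdf(X, 0.0) - log_pdf(X, 1.0), axis=1))

    print(n * kl_1d, kl_nd)   # both approximately n/2 = 2.5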










Tags: probability, probability-theory, statistics, information-theory, total-variation






asked Sep 13 '18 at 17:38









Albert

No, we don't. The TV distance simply doesn't tensorize well. A saving grace, however, is that TV is easily bounded by many $f$-divergences (e.g., Pinsker's inequality says $V \le \sqrt{2 D_{KL}}$), which is often sufficient to solve the problem at hand. Alternatively, one can 'metrise' the problem using some other divergence. KL, $\chi^2$, Jensen-Shannon and Hellinger are popular choices, with the last two being symmetric in their arguments and 'close to' actual norms, if that is important.
– stochasticboy321
    Sep 14 '18 at 2:49
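To see how the Pinsker route from the comment plays out on products, here is a minimal numerical sketch (again an illustration only; $f=\mathcal N(0,1)$ and $g=\mathcal N(\delta,1)$ are arbitrary choices for which both the exact product distance and the KL divergence have closed forms):

    import numpy as np
    from scipy.stats import norm

    # Illustration only: f = N(0, 1), g = N(delta, 1), so F and G are the Gaussians
    # N(0, I_n) and N(delta*1, I_n), whose means differ by sqrt(n)*delta in norm.
    # For unit-variance Gaussians, V = integral |F - G| = 2*(2*Phi(gap/2) - 1),
    # and D_KL(F, G) = n * delta^2 / 2.
    delta = 0.2
    for n in [1, 5, 25, 100]:
        gap = np.sqrt(n) * delta
        v_exact = 2 * (2 * norm.cdf(gap / 2) - 1)     # exact V(F, G); always <= 2
        v_pinsker = np.sqrt(2 * (n * delta**2 / 2))   # Pinsker: V <= sqrt(2 * D_KL)
        print(f"n = {n:3d}   exact V = {v_exact:.3f}   Pinsker bound = {v_pinsker:.3f}")

The bound tracks the exact value for moderate $n$ but, like any KL-based bound on $V$, becomes uninformative once $\sqrt{2 D_{KL}}$ exceeds the trivial ceiling $V\le 2$.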












1 Answer

The key property of the KL divergence is that it has a "chain rule", which leads to tensorization, i.e. a property in one dimension carrying over to multiple dimensions.

The TV distance, while convenient because it is a metric (and has a nice formula in simple spaces), does not tensorize the way the divergences do. This is because, by Monge-Kantorovich duality (which is what Blackbird used in the answer referred to above), the TV distance is the minimum, over couplings of the two distributions, of the probability that the coupled variables differ. Certainly, coupling the coordinates independently gives the trivial bound derived by Blackbird. However, couplings can carry more "structure", and this structure can become very complicated as the dimension increases (essentially, you have more coordinates to couple and play with), so there is more going on "between the components" than within the components themselves. This is why one should expect only trivial bounds for the TV distance in higher dimensions.

However, via MK duality one can convert the Bobkov-Götze theorem (a generalization of Pinsker's inequality) into a "transportation cost inequality", which does tensorize. In that case, bounds on the individual TV distances do give bounds on the product TV distance. Marton's theorem is the one I know here; see chapter 4, section 4 of van Handel's notes: https://web.math.princeton.edu/~rvan/APC550.pdf

So yes, in general it is very important to choose which distance you work with when dealing with inequalities; the fact that TV doesn't tensorize serves to emphasize that.
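For context (the other answer being referred to is not quoted here), the "trivial bound" obtained from the coupling characterization presumably reads as follows. With the question's convention $V=\int|F-G|$, the quantity $\tfrac12 V(F,G)$ equals the minimum of $\Pr[X\neq Y]$ over couplings $(X,Y)$ of $F$ and $G$. Coupling each coordinate pair $(X_i,Y_i)$ optimally and independently across $i$, a union bound gives
$$\tfrac12 V(F,G)\;\le\;\Pr[X\neq Y]\;\le\;\sum_{i=1}^{n}\Pr[X_i\neq Y_i]\;=\;\frac{n}{2}\,V(f,g),$$
i.e. $V(F,G)\le n\,V(f,g)$, which (since $V\le 2$ always) becomes vacuous as soon as $n\,V(f,g)\ge 2$.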






        answered yesterday









астон вілла олоф мэллбэрг
