Why, intuitively, is the order reversed when taking the transpose of the product?



It is well known that for invertible matrices $A,B$ of the same size we have
$$(AB)^{-1}=B^{-1}A^{-1},$$
and a nice way for me to remember this is the following sentence:

The opposite of putting on socks and shoes is taking the shoes off, followed by taking the socks off.

Now, a similar law holds for the transpose, namely
$$(AB)^T=B^TA^T$$
for matrices $A,B$ such that the product $AB$ is defined. My question is: is there any intuitive reason as to why the order of the factors is reversed in this case?

[Note that I'm aware of several proofs of this equality, and a proof is not what I'm after]

Thank you!

Tags: linear-algebra, matrices, soft-question, intuition

asked May 13 '15 at 6:35 by user1337; edited Mar 30 at 10:58 by GNUSupporter 8964民主女神 地下教會
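A quick numerical sanity check of the identity (a minimal sketch; the sizes and random matrices below are only illustrative and not part of the question):

```python
import numpy as np

# Illustrative sizes: A is 3x4 and B is 4x2, so AB is defined but BA is not.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))

# (AB)^T equals B^T A^T ...
assert np.allclose((A @ B).T, B.T @ A.T)

# ... while A^T B^T (a 4x3 times a 2x4) is not even a defined product,
# which already forces the reversed order for non-square matrices.
try:
    A.T @ B.T
except ValueError as err:
    print("A^T B^T is undefined:", err)
```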

















  • 14
    The transpose identity holds just as well when $A$ and $B$ are not square; if $A$ has size $m \times n$ and $B$ has size $n \times p$, where $p \neq m$, then the given order is the only possible order of multiplication of $A^T$ and $B^T$.
    – Travis, May 13 '15 at 6:48

  • 1
    @Travis I never required the matrices to be square for the transpose identity: all I said was that the product $AB$ must be defined.
    – user1337, May 13 '15 at 6:53

  • 7
    @user1337, I think Travis was just using non-square matrices as a way of seeing why we must have $(AB)^T = B^T A^T$ and not $A^TB^T$. If $A$ is $l \times m$ and $B$ is $m \times n$ then $AB$ makes sense and $B^T A^T$ is an $n \times m$ times an $m \times l$, which makes sense, but $A^T B^T$ doesn't work.
    – Jair Taylor, May 13 '15 at 6:58

  • 2
    @user1337 Jair's right, I didn't intend my comment as a sort of correction, just as an explanation that if there is some identity for $(AB)^T$ of the given sort that holds for all matrix products, matrix size alone forces a particular order. (BTW, Jair, I finished my Ph.D. at Washington a few years ago.)
    – Travis, May 13 '15 at 8:05

  • @Travis: Ah, I think I know who you are. I'm working with Sara Billey. I like UW a lot. :)
    – Jair Taylor, May 13 '15 at 15:33
















11 Answers


















196












One of my best college math professors always said:

Make a drawing first.

[image: a hand-drawn sketch of the product $AB$ on a sheet of paper; flipping the paper over answers the question, as the comments below discuss]

Although, he couldn't have made this one on the blackboard.

– mdup, answered May 13 '15 at 13:03








  • 4
    This is an absolutely beautiful explanation.
    – Roger Burt, May 13 '15 at 19:32

  • 1
    Couldn't believe it can be explained in such a simple and fun way. Must plus one!
    – Vim, May 14 '15 at 9:00

  • 3
    This is the most trite answer I have ever seen. I can't decide whether it deserves an upvote or a downvote.
    – enthdegree, May 16 '15 at 9:16

  • 3
    (+1) @enthdegree: An upvote, particularly if the drawing were augmented to indicate the $i$th row of $A$, the $j$th column of $B$, and the $(i, j)$ entry of $AB$ (all in marker heavy enough to bleed through the paper, of course), so that when the paper is flipped the OP's question is immediately answered with justification. :)
    – Andrew D. Hwang, May 16 '15 at 11:56

  • 2
    Sparsity in this drawing is by design. The more clutter you add, the more confusing it becomes. Readers familiar with matrix multiplication will probably have drawn this two-faced $(i,j)$ double-line in their mind.
    – mdup, May 17 '15 at 10:51



















33












By dualizing $AB: V_1 \stackrel{B}{\longrightarrow} V_2 \stackrel{A}{\longrightarrow} V_3$, we have $(AB)^T: V_3^* \stackrel{A^T}{\longrightarrow} V_2^* \stackrel{B^T}{\longrightarrow} V_1^*$.

Edit: $V^*$ is the dual space $\text{Hom}(V, \mathbb{F})$, the vector space of linear transformations from $V$ to its ground field, and if $A: V_1 \to V_2$ is a linear transformation, then $A^T: V_2^* \to V_1^*$ is its dual, defined by $A^T(f) = f \circ A$. By abuse of notation, if $A$ is the matrix representation with respect to bases $\mathcal{B}_1$ of $V_1$ and $\mathcal{B}_2$ of $V_2$, then $A^T$ is the matrix representation of the dual map with respect to the dual bases $\mathcal{B}_1^*$ and $\mathcal{B}_2^*$.

– Alex Fok, answered May 13 '15 at 6:46, edited May 13 '15 at 7:10
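To make the dual map concrete, here is a minimal sketch (the matrices, dimensions, and the functional $f$ are only illustrative): represent a functional on $V_3$ as a row vector, so $f(v)$ is a row-times-column product; then $f \circ A$ has row vector $fA$, and in column coordinates the pullback along $AB$ is $B^T A^T$, i.e. $A^T$ applied first, then $B^T$:

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 3))   # B : V1 -> V2   (dim V1 = 3, dim V2 = 4)
A = rng.standard_normal((2, 4))   # A : V2 -> V3   (dim V3 = 2)
f = rng.standard_normal((1, 2))   # a functional on V3, written as a row vector
v = rng.standard_normal((3, 1))   # a vector in V1

# The pullback of f along AB, i.e. f o (AB), evaluated at v ...
lhs = (f @ (A @ B)) @ v
# ... is f pulled back along A first and then along B: (f o A) o B.
rhs = ((f @ A) @ B) @ v
assert np.allclose(lhs, rhs)

# Writing functionals as coefficient columns (the transposes of the rows),
# the dual of A is multiplication by A.T, and the dual of AB is B.T @ A.T:
# apply A^T first (A is the last map applied in AB), then B^T.
assert np.allclose((f @ A @ B).T, B.T @ A.T @ f.T)
```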








  • 9
    In other words: the dualizing functor $V \rightarrow V^*$ is contravariant.
    – Jair Taylor, May 13 '15 at 6:47

  • I think you might elaborate though, explaining what $V^*$ is and what $A^T$ means in this context. Otherwise it seems more like a comment than an answer.
    – Jair Taylor, May 13 '15 at 6:50

  • Actually I was trying to make my answer as concise as possible, as the OP is not looking for a proof. Sure, I could have been more precise.
    – Alex Fok, May 13 '15 at 6:54

  • @AlexFok Looks neat; however, I don't know what dualizing means. Can you please elaborate?
    – user1337, May 13 '15 at 6:56

  • 4
    I wish it would be emphasised more in teaching that transposing makes linear operators switch to work on the dual spaces. Many people – at least in science, not sure how it is in maths – aren't aware of this at all; I was never explicitly taught about it and it was a huge a-ha moment when I first found out.
    – leftaroundabout, May 13 '15 at 20:01



















17












Here's another argument. First note that if $v$ is a column vector then $(Mv)^T = v^T M^T$. This is not hard to see: if you write down an example and do it both ways, you will see you are just doing the same computation with a different notation. Multiplying the column vector $v$ on the right by the rows of $M$ is the same as multiplying the row vector $v^T$ on the left by the columns of $M^T$.

Now let $(\cdot, \cdot)$ be the usual inner product on $\mathbb{R}^n$, that is, the dot product. Then the transpose $N = M^T$ of a matrix $M$ is the unique matrix $N$ with the property

$$(Mu, v) = (u, Nv).$$

This is just a consequence of associativity of matrix multiplication. The dot product of vectors $u,v$ is given by thinking of $u,v$ as column vectors, taking the transpose of one and multiplying: $(u,v) = u^T v$.

Then $(Mu,v) = (Mu)^T v = (u^T M^T) v = u^T (M^T v) = (u, M^T v)$.

Exercise: Show uniqueness!

With this alternate definition we can give a shoes-and-socks argument. We have

$$(ABu, v) = (Bu, A^Tv) = (u, B^TA^Tv)$$

for all $u,v$, and so $(AB)^T = B^T A^T$. The argument is exactly the same as the one for inverses, except we are "moving across the inner product" instead of "undoing".
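A short numerical check of the "moving across the inner product" steps (a minimal sketch; the sizes and random data are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 4))
B = rng.standard_normal((4, 3))
u = rng.standard_normal(3)
v = rng.standard_normal(5)

# (Mu, v) = (u, M^T v): the transpose moves a matrix to the other slot.
assert np.isclose(np.dot(A @ B @ u, v), np.dot(u, (A @ B).T @ v))

# Moving A and then B across the inner product, one at a time:
assert np.isclose(np.dot(A @ (B @ u), v), np.dot(B @ u, A.T @ v))
assert np.isclose(np.dot(B @ u, A.T @ v), np.dot(u, B.T @ (A.T @ v)))

# (AB)^T v and B^T A^T v pair the same way with every u, so they agree:
assert np.allclose((A @ B).T @ v, B.T @ A.T @ v)
```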








  • 3
    The second part is the best way of looking at this. The point is that the transpose is not really that natural of an operation by itself: it is important because it is the adjoint operation for the (real) Euclidean dot product. And the adjoint operation for any inner product has the property in question, for the same reason that the inverse operation has the property in question.
    – Ian, May 13 '15 at 15:22



















7












A matrix is a collection of entries that may be represented with 2 indices. When we multiply two matrices, each resultant entry is the sum of the products

$$C_{ik} = \sum_j A_{ij} B_{jk}.$$

Crucially, the 'middle' index, $j$, must be the same for both matrices (the first must be as wide as the second is tall).

A transpose is just a reversal of indices:

$$(A^T)_{ij} = A_{ji}.$$

It should now go without saying that

$$(C^T)_{ik} = C_{ki} = \sum_j A_{kj} B_{ji} = \sum_j (B^T)_{ij} (A^T)_{jk} = (B^T A^T)_{ik}.$$

Memory shortcut: multiplication fails immediately for non-square matrices when you forget to commute for a transpose.
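The same index bookkeeping, spelled out with explicit loops (a minimal sketch; the sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((2, 4))
B = rng.standard_normal((4, 3))

# C_ik = sum_j A_ij B_jk, computed entry by entry.
C = np.zeros((2, 3))
for i in range(2):
    for k in range(3):
        C[i, k] = sum(A[i, j] * B[j, k] for j in range(4))

# (C^T)_ik = C_ki = sum_j (B^T)_ij (A^T)_jk, i.e. the entries of B^T A^T.
CT = np.zeros((3, 2))
for i in range(3):
    for k in range(2):
        CT[i, k] = sum(B.T[i, j] * A.T[j, k] for j in range(4))

assert np.allclose(C, A @ B)
assert np.allclose(CT, C.T)
assert np.allclose(CT, B.T @ A.T)
```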




















    6












Each element of the matrix $AB$ is the inner product of a row of $A$ with a column of $B$.

$(AB)^T$ has the same elements that $AB$ does (just in different places), so its elements too must each come from a row of $A$ and a column of $B$.

However, if we want to start with $A^T$ and $B^T$, then a row of $A$ is the same thing as a column of $A^T$ (and vice versa for $B$ and $B^T$), so we need something that has columns of $A^T$ and rows of $B^T$. The matrix that we take columns from is always the right factor, so $A^T$ must be the right factor in the multiplication.

Similarly, $B^T$ must be the left factor because we need its rows (which are columns of the original $B$).




















      5













      the intuitive reason is that the entries of a product matrix are feynman path integrals, and transposing the matrixes corresponds simply to reversing the arrow of time for traveling along the paths.



      (so it's practically the same idea as in your shoes-and-socks example: matrix transposition is about time-reversal, just like function inversion is about time-reversal.)



      the (i,k)th entry in a product matrix ab is the sum over j of a(i,j).b(j,k). in other words, it's a sum over all "2-step paths" (i,j,k) from i to k, each path visiting one intermediate point j on its way from i to k.



      this sum over paths is called a "feynman path integral". if you read feynman's original paper on the subject, focusing on the parts that are easy to understand, you'll see that that was feynman's basic message: that whenever you have a big long string of matrixes to multiply, each entry in the product matrix is a "sum over paths" aka "path integral", with the contribution of each particular path being a long product of "transition quantities", each associated with one transition-step along the path.



      this "path" interpretation of matrix multiplication actually gets more intuitive for longer strings of matrixes, because then each path consists of many steps. for example each entry of a matrix product abc...z is a sum over 26-step paths; each path visits 27 points but with just 26 transition-steps from one point to the next.














  • 3
    I very much doubt that this explanation will help the OP, but I found the analogy to Feynman path integrals told me something important about the path integrals. I'm not a physicist and never looked past a paragraph or two about them. Now I can see that they resemble counting paths in a graph by looking at powers of the adjacency matrix.
    – Ethan Bolker, May 14 '15 at 14:45

  • 2
    I don't think we really need to attach Feynman's name here, but in general this combinatorial view of matrix multiplication as a sum over walks is very helpful. The time-reversal interpretation is a pretty useful way of looking at the transpose.
    – Jair Taylor, May 15 '15 at 1:04


















      3













      [this is an attempt to combine two previously given answers, mdup's video demo and my "path-sum" story, so it might help to refer to those.]



      after watching mdup's video demo i started wondering how it relates to the "path-sum" interpretation of matrix multiplication. the key seems to be that mdup's hand-drawn picture of the matrix product AB wants to be folded up to form the visible faces of an oblong box whose three dimensions correspond precisely to the points i, j, and k in a three-point path (i,j,k). this is illustrated by the pairs of pictures below, each pair showing the oblong box first in its folded-up 3-dimensional form and then in its flattened-out 2-dimensional form. in each case the box is held up to a mirror to portray the effect of transposition of matrixes.



      in the first pair of pictures, the i, j, and k axises are marked, and in the folded-up 3-dimensional form you can see how transposition reverses the order of the axises from i,j,k to k,j,i. in the flattened-out 2-dimensional form you can see how it wants to be folded up because the edges marked j are all the same length (and also, because it was folded up like that when i bought the soap).



      (image) http://math.ucr.edu/~jdolan/matrix-3d-axises-1.jpg

      (image) http://math.ucr.edu/~jdolan/matrix-2d-axises-1.jpg



      the second pair of pictures indicate how an entry of the product matrix is calculated. in the flattened-out 2-dimensional form, a row of the first matrix is paired with a column of the second matrix, whereas in the folded-up 3-dimensional form, that "row" and that "column" actually lie parallel to each other because of the 3d arrangement.



      (image) http://math.ucr.edu/~jdolan/matrix-3d-dotproduct-2.jpg

      (image) http://math.ucr.edu/~jdolan/matrix-2d-dotproduct-1.jpg



      in other words, each 3-point path (i,j,k) corresponds to a location inside the box, and at that location you write down (using a 3-dimensional printer or else just writing on the air) the product of the transition-quantities for the two transition-steps in the path, A_[i,j] for the transition-step from i to j and B_[j,k] for the transition-step from j to k. this results in a 3-dimensional matrix of numbers written on the air inside the box, but since the desired matrix product AB is only a 2-dimensional matrix, the 3-dimensional matrix is squashed down to 2-dimensional by summing over the j dimension. this is the path-sum: in order for two paths to contribute to the same path-sum they're required to be in direct competition with each other, beginning at the same origin i and ending at the same destination k, so the only index that we sum over is the intermediate index j.



      the 3-dimensional folded-up form and the 2-dimensional flattened-out form have each their own advantages and disadvantages. the 3-dimensional folded-up form brings out the path-sums and the 3-dimensional nature of matrix multiplication, while the 2-dimensional flattened-out form is better-adapted to writing the calculation down on 2-dimensional paper (which remains easier than writing on 3-dimensional air even still today).



      anyway, i'll get off my soapbox for now ...


























        1












(Short form of Jair Taylor's answer)

In the expression $v^T A B w$, the vectors $v$ and $w$ "see" $A$ and $B$ from different ends, hence in different order.




















          1












Considering the dimensions of the various matrices shows that reversing the order is necessary.

If $A$ is $m \times p$ and $B$ is $p \times n$, then

$AB$ is $m \times n$,

$(AB)^T$ is $n \times m$,

$A^T$ is $p \times m$ and $B^T$ is $n \times p$.

Thus $B^T A^T$ has the same dimensions as $(AB)^T$.
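The same bookkeeping as a quick shape check (a minimal sketch; the particular dimensions are arbitrary):

```python
import numpy as np

m, p, n = 2, 3, 4
A = np.ones((m, p))
B = np.ones((p, n))

assert (A @ B).shape == (m, n)
assert (A @ B).T.shape == (n, m)
assert (B.T @ A.T).shape == (n, m)   # same shape as (AB)^T

# A^T B^T would be (p, m) times (n, p): the inner dimensions m and n
# do not match (unless m == n), so that order is not even defined in general.
```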




















            1












My professor always called them the shoe-sock theorems or the like, the idea basically being that you put on your socks first and then the shoes, but to do the reverse you must take off the shoes first and then the socks.




















              1












[image]

Turn (transpose) onto the street $B^T$, perpendicular to $B$, then turn (transpose) onto $A^T$, perpendicular to $A$.











                protected by Zev Chonoles May 19 '15 at 18:54



                Thank you for your interest in this question.
                Because it has attracted low-quality or spam answers that had to be removed, posting an answer now requires 10 reputation on this site (the association bonus does not count).



                Would you like to answer one of these unanswered questions instead?














                11 Answers
                11






                active

                oldest

                votes








                11 Answers
                11






                active

                oldest

                votes









                active

                oldest

                votes






                active

                oldest

                votes









                196












                $begingroup$

                One of my best college math professor always said:




                Make a drawing first.




                enter image description here



                Although, he couldn't have made this one on the blackboard.






                share|cite|improve this answer









                $endgroup$








                • 4




                  $begingroup$
                  This is an absolutely beautiful explanation.
                  $endgroup$
                  – Roger Burt
                  May 13 '15 at 19:32






                • 1




                  $begingroup$
                  Couldn't believe it can be explained in such a simple and fun way. Must plus one!
                  $endgroup$
                  – Vim
                  May 14 '15 at 9:00






                • 3




                  $begingroup$
                  This is the most trite answer i have ever seen. I can't decide whether it deserves an upvote or a downvote.
                  $endgroup$
                  – enthdegree
                  May 16 '15 at 9:16







                • 3




                  $begingroup$
                  (+1) @enthdegree: An upvote, particularly if the drawing were augmented to indicate the $i$th row of $A$, the $j$th column of $B$, and the $(i, j)$ entry of $AB$ (all in marker heavy enough to bleed through the paper, of course), so that when the paper is flipped the OP's question is immediately answered with justification. :)
                  $endgroup$
                  – Andrew D. Hwang
                  May 16 '15 at 11:56






                • 2




                  $begingroup$
                  Sparsity in this drawing is by design. The more clutter you add, the more confuse it becomes. Readers familiar with matrix multiplication will probably have drawn this two-faced $(i,j)$ double-line in their mind.
                  $endgroup$
                  – mdup
                  May 17 '15 at 10:51
















                196












                $begingroup$

                One of my best college math professor always said:




                Make a drawing first.




                enter image description here



                Although, he couldn't have made this one on the blackboard.






                share|cite|improve this answer









                $endgroup$








                • 4




                  $begingroup$
                  This is an absolutely beautiful explanation.
                  $endgroup$
                  – Roger Burt
                  May 13 '15 at 19:32






                • 1




                  $begingroup$
                  Couldn't believe it can be explained in such a simple and fun way. Must plus one!
                  $endgroup$
                  – Vim
                  May 14 '15 at 9:00






                • 3




                  $begingroup$
                  This is the most trite answer i have ever seen. I can't decide whether it deserves an upvote or a downvote.
                  $endgroup$
                  – enthdegree
                  May 16 '15 at 9:16







                • 3




                  $begingroup$
                  (+1) @enthdegree: An upvote, particularly if the drawing were augmented to indicate the $i$th row of $A$, the $j$th column of $B$, and the $(i, j)$ entry of $AB$ (all in marker heavy enough to bleed through the paper, of course), so that when the paper is flipped the OP's question is immediately answered with justification. :)
                  $endgroup$
                  – Andrew D. Hwang
                  May 16 '15 at 11:56






                • 2




                  $begingroup$
                  Sparsity in this drawing is by design. The more clutter you add, the more confuse it becomes. Readers familiar with matrix multiplication will probably have drawn this two-faced $(i,j)$ double-line in their mind.
                  $endgroup$
                  – mdup
                  May 17 '15 at 10:51














                196












                196








                196





                $begingroup$

                One of my best college math professor always said:




                Make a drawing first.




                enter image description here



                Although, he couldn't have made this one on the blackboard.






                share|cite|improve this answer









                $endgroup$



                One of my best college math professor always said:




                Make a drawing first.




                enter image description here



                Although, he couldn't have made this one on the blackboard.







                share|cite|improve this answer












                share|cite|improve this answer



                share|cite|improve this answer










                answered May 13 '15 at 13:03









                mdupmdup

                1,606166




                1,606166







                • 4




                  $begingroup$
                  This is an absolutely beautiful explanation.
                  $endgroup$
                  – Roger Burt
                  May 13 '15 at 19:32






                • 1




                  $begingroup$
                  Couldn't believe it can be explained in such a simple and fun way. Must plus one!
                  $endgroup$
                  – Vim
                  May 14 '15 at 9:00






                • 3




                  $begingroup$
                  This is the most trite answer i have ever seen. I can't decide whether it deserves an upvote or a downvote.
                  $endgroup$
                  – enthdegree
                  May 16 '15 at 9:16







                • 3




                  $begingroup$
                  (+1) @enthdegree: An upvote, particularly if the drawing were augmented to indicate the $i$th row of $A$, the $j$th column of $B$, and the $(i, j)$ entry of $AB$ (all in marker heavy enough to bleed through the paper, of course), so that when the paper is flipped the OP's question is immediately answered with justification. :)
                  $endgroup$
                  – Andrew D. Hwang
                  May 16 '15 at 11:56






                • 2




                  $begingroup$
                  Sparsity in this drawing is by design. The more clutter you add, the more confuse it becomes. Readers familiar with matrix multiplication will probably have drawn this two-faced $(i,j)$ double-line in their mind.
                  $endgroup$
                  – mdup
                  May 17 '15 at 10:51













                • 4




                  $begingroup$
                  This is an absolutely beautiful explanation.
                  $endgroup$
                  – Roger Burt
                  May 13 '15 at 19:32






                • 1




                  $begingroup$
                  Couldn't believe it can be explained in such a simple and fun way. Must plus one!
                  $endgroup$
                  – Vim
                  May 14 '15 at 9:00






                • 3




                  $begingroup$
                  This is the most trite answer i have ever seen. I can't decide whether it deserves an upvote or a downvote.
                  $endgroup$
                  – enthdegree
                  May 16 '15 at 9:16







                • 3




                  $begingroup$
                  (+1) @enthdegree: An upvote, particularly if the drawing were augmented to indicate the $i$th row of $A$, the $j$th column of $B$, and the $(i, j)$ entry of $AB$ (all in marker heavy enough to bleed through the paper, of course), so that when the paper is flipped the OP's question is immediately answered with justification. :)
                  $endgroup$
                  – Andrew D. Hwang
                  May 16 '15 at 11:56






                • 2




                  $begingroup$
                  Sparsity in this drawing is by design. The more clutter you add, the more confuse it becomes. Readers familiar with matrix multiplication will probably have drawn this two-faced $(i,j)$ double-line in their mind.
                  $endgroup$
                  – mdup
                  May 17 '15 at 10:51








                4




                4




                $begingroup$
                This is an absolutely beautiful explanation.
                $endgroup$
                – Roger Burt
                May 13 '15 at 19:32




                $begingroup$
                This is an absolutely beautiful explanation.
                $endgroup$
                – Roger Burt
                May 13 '15 at 19:32




                1




                1




                $begingroup$
                Couldn't believe it can be explained in such a simple and fun way. Must plus one!
                $endgroup$
                – Vim
                May 14 '15 at 9:00




                $begingroup$
                Couldn't believe it can be explained in such a simple and fun way. Must plus one!
                $endgroup$
                – Vim
                May 14 '15 at 9:00




                3




                3




                $begingroup$
                This is the most trite answer i have ever seen. I can't decide whether it deserves an upvote or a downvote.
                $endgroup$
                – enthdegree
                May 16 '15 at 9:16





                $begingroup$
                This is the most trite answer i have ever seen. I can't decide whether it deserves an upvote or a downvote.
                $endgroup$
                – enthdegree
                May 16 '15 at 9:16





                3




                3




                $begingroup$
                (+1) @enthdegree: An upvote, particularly if the drawing were augmented to indicate the $i$th row of $A$, the $j$th column of $B$, and the $(i, j)$ entry of $AB$ (all in marker heavy enough to bleed through the paper, of course), so that when the paper is flipped the OP's question is immediately answered with justification. :)
                $endgroup$
                – Andrew D. Hwang
                May 16 '15 at 11:56




                $begingroup$
                (+1) @enthdegree: An upvote, particularly if the drawing were augmented to indicate the $i$th row of $A$, the $j$th column of $B$, and the $(i, j)$ entry of $AB$ (all in marker heavy enough to bleed through the paper, of course), so that when the paper is flipped the OP's question is immediately answered with justification. :)
                $endgroup$
                – Andrew D. Hwang
                May 16 '15 at 11:56




                2




                2




                $begingroup$
                Sparsity in this drawing is by design. The more clutter you add, the more confuse it becomes. Readers familiar with matrix multiplication will probably have drawn this two-faced $(i,j)$ double-line in their mind.
                $endgroup$
                – mdup
                May 17 '15 at 10:51





                $begingroup$
                Sparsity in this drawing is by design. The more clutter you add, the more confuse it becomes. Readers familiar with matrix multiplication will probably have drawn this two-faced $(i,j)$ double-line in their mind.
                $endgroup$
                – mdup
                May 17 '15 at 10:51












                33












                $begingroup$

                By dualizing $AB: V_1stackrelBlongrightarrow V_2stackrelAlongrightarrowV_3$, we have $(AB)^T: V_3^*stackrelA^TlongrightarrowV_2^*stackrelB^TlongrightarrowV_1^*$.



                Edit: $V^*$ is the dual space $textHom(V, mathbbF)$, the vector space of linear transformations from $V$ to its ground field, and if $A: V_1to V_2$ is a linear transformation, then $A^T: V_2^*to V_1^*$ is its dual defined by $A^T(f)=fcirc A$. By abuse of notation, if $A$ is the matrix representation with respect to bases $mathcalB_1$ of $V_1$ and $mathcalB_2$ of $V_2$, then $A^T$ is the matrix representation of the dual map with respect to the dual bases $mathcalB_1^*$ and $mathcalB_2^*$.






                share|cite|improve this answer











                $endgroup$








                • 9




                  $begingroup$
                  In other words: the dualizing functor $V rightarrow V^*$ is contravariant.
                  $endgroup$
                  – Jair Taylor
                  May 13 '15 at 6:47











                • $begingroup$
                  I think you might elaborate though, explaining what $V^*$ is and what $A^T$ means in this context. Otherwise it seems more like a comment than an answer.
                  $endgroup$
                  – Jair Taylor
                  May 13 '15 at 6:50










                • $begingroup$
                  Actually I was trying to make my answer as concise as possible as the OP does not looking for a proof. Sure I could have been more precise.
                  $endgroup$
                  – Alex Fok
                  May 13 '15 at 6:54










                • $begingroup$
                  @AlexFok looks neat, however I don't know what dualizing means. Can you please elaborate?
                  $endgroup$
                  – user1337
                  May 13 '15 at 6:56






                • 4




                  $begingroup$
                  I wish it would be emphasised more in teaching that transposing makes linear operators switch to work on the dual spaces. Many people – at least in science, not sure how it is in maths – aren't aware of this at all; I was never explicitly taught about it and it was a huge a-ha moment when I first found out.
                  $endgroup$
                  – leftaroundabout
                  May 13 '15 at 20:01
















                33












                $begingroup$

                By dualizing $AB: V_1stackrelBlongrightarrow V_2stackrelAlongrightarrowV_3$, we have $(AB)^T: V_3^*stackrelA^TlongrightarrowV_2^*stackrelB^TlongrightarrowV_1^*$.



                Edit: $V^*$ is the dual space $textHom(V, mathbbF)$, the vector space of linear transformations from $V$ to its ground field, and if $A: V_1to V_2$ is a linear transformation, then $A^T: V_2^*to V_1^*$ is its dual defined by $A^T(f)=fcirc A$. By abuse of notation, if $A$ is the matrix representation with respect to bases $mathcalB_1$ of $V_1$ and $mathcalB_2$ of $V_2$, then $A^T$ is the matrix representation of the dual map with respect to the dual bases $mathcalB_1^*$ and $mathcalB_2^*$.






                share|cite|improve this answer











                $endgroup$








                • 9




                  $begingroup$
                  In other words: the dualizing functor $V rightarrow V^*$ is contravariant.
                  $endgroup$
                  – Jair Taylor
                  May 13 '15 at 6:47











                • $begingroup$
                  I think you might elaborate though, explaining what $V^*$ is and what $A^T$ means in this context. Otherwise it seems more like a comment than an answer.
                  $endgroup$
                  – Jair Taylor
                  May 13 '15 at 6:50










                • $begingroup$
                  Actually I was trying to make my answer as concise as possible as the OP does not looking for a proof. Sure I could have been more precise.
                  $endgroup$
                  – Alex Fok
                  May 13 '15 at 6:54










                • $begingroup$
                  @AlexFok looks neat, however I don't know what dualizing means. Can you please elaborate?
                  $endgroup$
                  – user1337
                  May 13 '15 at 6:56






                • 4




                  $begingroup$
                  I wish it would be emphasised more in teaching that transposing makes linear operators switch to work on the dual spaces. Many people – at least in science, not sure how it is in maths – aren't aware of this at all; I was never explicitly taught about it and it was a huge a-ha moment when I first found out.
                  $endgroup$
                  – leftaroundabout
                  May 13 '15 at 20:01














                33












                33








                33





                $begingroup$

                By dualizing $AB: V_1stackrelBlongrightarrow V_2stackrelAlongrightarrowV_3$, we have $(AB)^T: V_3^*stackrelA^TlongrightarrowV_2^*stackrelB^TlongrightarrowV_1^*$.



                Edit: $V^*$ is the dual space $textHom(V, mathbbF)$, the vector space of linear transformations from $V$ to its ground field, and if $A: V_1to V_2$ is a linear transformation, then $A^T: V_2^*to V_1^*$ is its dual defined by $A^T(f)=fcirc A$. By abuse of notation, if $A$ is the matrix representation with respect to bases $mathcalB_1$ of $V_1$ and $mathcalB_2$ of $V_2$, then $A^T$ is the matrix representation of the dual map with respect to the dual bases $mathcalB_1^*$ and $mathcalB_2^*$.






                share|cite|improve this answer











                $endgroup$



                By dualizing $AB: V_1stackrelBlongrightarrow V_2stackrelAlongrightarrowV_3$, we have $(AB)^T: V_3^*stackrelA^TlongrightarrowV_2^*stackrelB^TlongrightarrowV_1^*$.



                Edit: $V^*$ is the dual space $textHom(V, mathbbF)$, the vector space of linear transformations from $V$ to its ground field, and if $A: V_1to V_2$ is a linear transformation, then $A^T: V_2^*to V_1^*$ is its dual defined by $A^T(f)=fcirc A$. By abuse of notation, if $A$ is the matrix representation with respect to bases $mathcalB_1$ of $V_1$ and $mathcalB_2$ of $V_2$, then $A^T$ is the matrix representation of the dual map with respect to the dual bases $mathcalB_1^*$ and $mathcalB_2^*$.







                share|cite|improve this answer














                share|cite|improve this answer



                share|cite|improve this answer








                edited May 13 '15 at 7:10

























                answered May 13 '15 at 6:46









                Alex FokAlex Fok

                3,866818




                3,866818







                • 9




                  $begingroup$
                  In other words: the dualizing functor $V rightarrow V^*$ is contravariant.
                  $endgroup$
                  – Jair Taylor
                  May 13 '15 at 6:47











                • $begingroup$
                  I think you might elaborate though, explaining what $V^*$ is and what $A^T$ means in this context. Otherwise it seems more like a comment than an answer.
                  $endgroup$
                  – Jair Taylor
                  May 13 '15 at 6:50










                • $begingroup$
                  Actually I was trying to make my answer as concise as possible as the OP does not looking for a proof. Sure I could have been more precise.
                  $endgroup$
                  – Alex Fok
                  May 13 '15 at 6:54










                • $begingroup$
                  @AlexFok looks neat, however I don't know what dualizing means. Can you please elaborate?
                  $endgroup$
                  – user1337
                  May 13 '15 at 6:56






                • 4




                  $begingroup$
                  I wish it would be emphasised more in teaching that transposing makes linear operators switch to work on the dual spaces. Many people – at least in science, not sure how it is in maths – aren't aware of this at all; I was never explicitly taught about it and it was a huge a-ha moment when I first found out.
                  $endgroup$
                  – leftaroundabout
                  May 13 '15 at 20:01













                • 9




                  $begingroup$
                  In other words: the dualizing functor $V rightarrow V^*$ is contravariant.
                  $endgroup$
                  – Jair Taylor
                  May 13 '15 at 6:47











                • $begingroup$
                  I think you might elaborate though, explaining what $V^*$ is and what $A^T$ means in this context. Otherwise it seems more like a comment than an answer.
                  $endgroup$
                  – Jair Taylor
                  May 13 '15 at 6:50










                • $begingroup$
                  Actually I was trying to make my answer as concise as possible as the OP does not looking for a proof. Sure I could have been more precise.
                  $endgroup$
                  – Alex Fok
                  May 13 '15 at 6:54










                • $begingroup$
                  @AlexFok looks neat, however I don't know what dualizing means. Can you please elaborate?
                  $endgroup$
                  – user1337
                  May 13 '15 at 6:56






                • 4




                  $begingroup$
                  I wish it would be emphasised more in teaching that transposing makes linear operators switch to work on the dual spaces. Many people – at least in science, not sure how it is in maths – aren't aware of this at all; I was never explicitly taught about it and it was a huge a-ha moment when I first found out.
                  $endgroup$
                  – leftaroundabout
                  May 13 '15 at 20:01








                9




                9




                $begingroup$
                In other words: the dualizing functor $V rightarrow V^*$ is contravariant.
                $endgroup$
                – Jair Taylor
                May 13 '15 at 6:47





                $begingroup$
                In other words: the dualizing functor $V rightarrow V^*$ is contravariant.
                $endgroup$
                – Jair Taylor
                May 13 '15 at 6:47













                $begingroup$
                I think you might elaborate though, explaining what $V^*$ is and what $A^T$ means in this context. Otherwise it seems more like a comment than an answer.
                $endgroup$
                – Jair Taylor
                May 13 '15 at 6:50




                $begingroup$
                I think you might elaborate though, explaining what $V^*$ is and what $A^T$ means in this context. Otherwise it seems more like a comment than an answer.
                $endgroup$
                – Jair Taylor
                May 13 '15 at 6:50












                $begingroup$
                Actually I was trying to make my answer as concise as possible as the OP does not looking for a proof. Sure I could have been more precise.
                $endgroup$
                – Alex Fok
                May 13 '15 at 6:54




                $begingroup$
                Actually I was trying to make my answer as concise as possible as the OP does not looking for a proof. Sure I could have been more precise.
                $endgroup$
                – Alex Fok
                May 13 '15 at 6:54












                $begingroup$
                @AlexFok looks neat, however I don't know what dualizing means. Can you please elaborate?
                $endgroup$
                – user1337
                May 13 '15 at 6:56




                $begingroup$
                @AlexFok looks neat, however I don't know what dualizing means. Can you please elaborate?
                $endgroup$
                – user1337
                May 13 '15 at 6:56




                4




                4




                $begingroup$
                I wish it would be emphasised more in teaching that transposing makes linear operators switch to work on the dual spaces. Many people – at least in science, not sure how it is in maths – aren't aware of this at all; I was never explicitly taught about it and it was a huge a-ha moment when I first found out.
                $endgroup$
                – leftaroundabout
                May 13 '15 at 20:01





                $begingroup$
                I wish it would be emphasised more in teaching that transposing makes linear operators switch to work on the dual spaces. Many people – at least in science, not sure how it is in maths – aren't aware of this at all; I was never explicitly taught about it and it was a huge a-ha moment when I first found out.
                $endgroup$
                – leftaroundabout
                May 13 '15 at 20:01












                17












                $begingroup$

                Here's another argument. First note that if $v$ is a column vector then $(Mv)^T = v^T M^T$. This is not hard to see - if you write down an example and do it both ways, you will see you are just doing the same computation with a different notation. Multiplying the column vector $v$ on the right by the rows of $M$ is the same as multiplying the row vector $v^T$ on the left by the columns of $M^T$.



                Now let $( cdot , cdot )$ be the usual inner product on $mathbbR^n$, that is, the dot product. Then the transpose $N = M^T$ of a matrix $M$ is the unique matrix $N$ with the property



                $$(Mu, v) = (u, Nv).$$



                This is just a consequence of associativity of matrix multiplication. The dot product of vectors $u,v$ is given by thinking of $u,v$ as column vectors, taking the transpose of one and doing the dot product: $(u,v) = u^T v$.



                Then $(Mu,v) = (Mu)^T v = (u^T M^T) v = u^T (M^Tv) = (u, M^Tv)$.



                Exercise: Show uniqueness!



                With this alternate definition we can give a shoes-and-socks argument. We have



                $$( ABu, v) = (Bu, A^Tv) = (u, B^TA^Tv)$$



                for all $u,v$, and so $(AB)^T = B^T A^T$. The argument is exactly the same as the one for inverses, except we are "moving across the inner product" instead of "undoing".






                share|cite|improve this answer









                $endgroup$








The second part is the best way of looking at this. The point is that the transpose is not really that natural of an operation by itself: it is important because it is the adjoint operation for the (real) Euclidean dot product. And the adjoint operation for any inner product has the property in question, for the same reason that the inverse operation has the property in question.
– Ian, May 13 '15 at 15:22
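For readers who want to see this numerically: below is a minimal NumPy sketch (my addition, not part of the original answer) checking the adjoint property $(ABu, v) = (u, B^TA^Tv)$ and the resulting identity $(AB)^T = B^TA^T$ on random matrices; the shapes and the random seed are arbitrary choices.

    import numpy as np

    rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

    # Rectangular factors, so the shapes also have to line up correctly.
    A = rng.standard_normal((3, 4))
    B = rng.standard_normal((4, 5))
    u = rng.standard_normal(5)
    v = rng.standard_normal(3)

    # Adjoint property: (ABu) . v == u . (B^T A^T v)
    lhs = (A @ B @ u) @ v
    rhs = u @ (B.T @ A.T @ v)
    print(np.isclose(lhs, rhs))               # True

    # The reversal itself: (AB)^T == B^T A^T
    print(np.allclose((A @ B).T, B.T @ A.T))  # True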
















A matrix is a collection of entries that may be represented with 2 indices. When we multiply two matrices, each resultant entry is the sum of the products

$$C_{ik} = \sum_j A_{ij} B_{jk}.$$

Crucially, the 'middle' index, $j$, must be the same for both matrices (the first must be as wide as the second is tall).

A transpose is just a reversal of indices:

$$(A^T)_{ij} = A_{ji}.$$

It should now go without saying that

$$(C^T)_{ik} = C_{ki} = \sum_j A_{kj} B_{ji} = \sum_j (B^T)_{ij} (A^T)_{jk} = (B^T A^T)_{ik}.$$

Memory shortcut: if you forget to reverse the order when transposing, the multiplication is not even defined for non-square matrices.

answered May 13 '15 at 19:48 – user121330
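As a quick numerical check of the index computation above (my addition, not part of the answer): compare the $(i,k)$ entry of $(AB)^T$ with $\sum_j (B^T)_{ij}(A^T)_{jk}$ using NumPy's einsum; the shapes below are arbitrary.

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((2, 3))   # A_{ij}
    B = rng.standard_normal((3, 4))   # B_{jk}

    C = np.einsum('ij,jk->ik', A, B)                 # C_{ik} = sum_j A_{ij} B_{jk}
    CT_direct = C.T                                  # (C^T)_{ik} = C_{ki}
    CT_reversed = np.einsum('ij,jk->ik', B.T, A.T)   # (B^T A^T)_{ik}

    print(np.allclose(CT_direct, CT_reversed))       # True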

















Each element of the matrix $AB$ is the inner product of a row of $A$ with a column of $B$.

$(AB)^T$ has the same elements that $AB$ does (just in different places), so its elements too must each come from a row of $A$ and a column of $B$.

However, if we want to start with $A^T$ and $B^T$, then a row of $A$ is the same thing as a column of $A^T$ (and vice versa for $B$ and $B^T$), so we need something that has columns of $A^T$ and rows of $B^T$. The matrix that we take columns from is always the right factor, so $A^T$ must be the right factor in the multiplication.

Similarly, $B^T$ must be the left factor because we need its rows (which are columns of the original $B$).

answered May 14 '15 at 19:19 – Henning Makholm
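A small NumPy illustration of this row/column bookkeeping (my addition; the shapes and the inspected entry are arbitrary): entry $(k,i)$ of $(AB)^T$ is row $i$ of $A$ dotted with column $k$ of $B$, which is the same as column $i$ of $A^T$ dotted with row $k$ of $B^T$.

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.standard_normal((3, 4))
    B = rng.standard_normal((4, 2))

    i, k = 1, 0  # arbitrary entry to inspect

    entry_of_ABT = (A @ B).T[k, i]
    row_dot_col  = A[i, :] @ B[:, k]       # row i of A with column k of B
    col_dot_row  = A.T[:, i] @ B.T[k, :]   # same vectors, read off A^T and B^T

    print(np.isclose(entry_of_ABT, row_dot_col))  # True
    print(np.isclose(entry_of_ABT, col_dot_row))  # True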

















The intuitive reason is that the entries of a product matrix are Feynman path integrals, and transposing the matrices corresponds simply to reversing the arrow of time for traveling along the paths.

(So it's practically the same idea as in your shoes-and-socks example: matrix transposition is about time-reversal, just like function inversion is about time-reversal.)

The $(i,k)$th entry in a product matrix $AB$ is the sum over $j$ of $A(i,j)\,B(j,k)$. In other words, it's a sum over all "2-step paths" $(i,j,k)$ from $i$ to $k$, each path visiting one intermediate point $j$ on its way from $i$ to $k$.

This sum over paths is called a "Feynman path integral". If you read Feynman's original paper on the subject, focusing on the parts that are easy to understand, you'll see that this was Feynman's basic message: whenever you have a big long string of matrices to multiply, each entry in the product matrix is a "sum over paths", aka "path integral", with the contribution of each particular path being a long product of "transition quantities", each associated with one transition-step along the path.

This "path" interpretation of matrix multiplication actually gets more intuitive for longer strings of matrices, because then each path consists of many steps. For example, each entry of a matrix product $ABC\cdots Z$ is a sum over 26-step paths; each path visits 27 points but with just 26 transition-steps from one point to the next.

answered May 13 '15 at 15:29 – james dolan








I very much doubt that this explanation will help the OP, but I found the analogy to Feynman path integrals told me something important about the path integrals. I'm not a physicist and never looked past a paragraph or two about them. Now I can see that they resemble counting paths in a graph by looking at powers of the adjacency matrix.
– Ethan Bolker, May 14 '15 at 14:45







I don't think we really need to attach Feynman's name here, but in general this combinatorial view of matrix multiplication as a sum over walks is very helpful. The time-reversal interpretation is a pretty useful way of looking at the transpose.
– Jair Taylor, May 15 '15 at 1:04
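To make the walk-counting picture from this answer and the comments concrete (my addition, not from the thread): for a directed graph with adjacency matrix $A$, the $(i,k)$ entry of $A^2$ counts 2-step walks from $i$ to $k$, and transposing corresponds to running the walks backwards. The small graph below is made up purely for illustration.

    import numpy as np

    # Adjacency matrix of a small made-up directed graph on 3 vertices:
    # edges 0->1, 0->2, 1->2, 2->0.
    A = np.array([[0, 1, 1],
                  [0, 0, 1],
                  [1, 0, 0]])

    two_step = A @ A            # entry (i, k): number of 2-step walks i -> j -> k
    reversed_walks = A.T @ A.T  # 2-step walks in the graph with every edge reversed

    # Reversing every edge turns a walk i -> j -> k into k -> j -> i, so the
    # reversed count from k to i equals two_step[i, k]:
    print(np.array_equal(two_step.T, reversed_walks))  # True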















[This is an attempt to combine two previously given answers, mdup's video demo and my "path-sum" story, so it might help to refer to those.]

After watching mdup's video demo I started wondering how it relates to the "path-sum" interpretation of matrix multiplication. The key seems to be that mdup's hand-drawn picture of the matrix product $AB$ wants to be folded up to form the visible faces of an oblong box whose three dimensions correspond precisely to the points $i$, $j$, and $k$ in a three-point path $(i,j,k)$. This is illustrated by the pairs of pictures below, each pair showing the oblong box first in its folded-up 3-dimensional form and then in its flattened-out 2-dimensional form. In each case the box is held up to a mirror to portray the effect of transposition of matrices.

In the first pair of pictures, the $i$, $j$, and $k$ axes are marked, and in the folded-up 3-dimensional form you can see how transposition reverses the order of the axes from $i,j,k$ to $k,j,i$. In the flattened-out 2-dimensional form you can see how it wants to be folded up because the edges marked $j$ are all the same length (and also, because it was folded up like that when I bought the soap).

http://math.ucr.edu/~jdolan/matrix-3d-axises-1.jpg

http://math.ucr.edu/~jdolan/matrix-2d-axises-1.jpg

The second pair of pictures indicates how an entry of the product matrix is calculated. In the flattened-out 2-dimensional form, a row of the first matrix is paired with a column of the second matrix, whereas in the folded-up 3-dimensional form, that "row" and that "column" actually lie parallel to each other because of the 3d arrangement.

http://math.ucr.edu/~jdolan/matrix-3d-dotproduct-2.jpg

http://math.ucr.edu/~jdolan/matrix-2d-dotproduct-1.jpg

In other words, each 3-point path $(i,j,k)$ corresponds to a location inside the box, and at that location you write down (using a 3-dimensional printer or else just writing on the air) the product of the transition quantities for the two transition-steps in the path, $A_{ij}$ for the transition-step from $i$ to $j$ and $B_{jk}$ for the transition-step from $j$ to $k$. This results in a 3-dimensional array of numbers written on the air inside the box, but since the desired matrix product $AB$ is only a 2-dimensional matrix, the 3-dimensional array is squashed down to 2 dimensions by summing over the $j$ dimension. This is the path-sum: in order for two paths to contribute to the same path-sum they're required to be in direct competition with each other, beginning at the same origin $i$ and ending at the same destination $k$, so the only index that we sum over is the intermediate index $j$.

The 3-dimensional folded-up form and the 2-dimensional flattened-out form each have their own advantages and disadvantages. The 3-dimensional folded-up form brings out the path-sums and the 3-dimensional nature of matrix multiplication, while the 2-dimensional flattened-out form is better adapted to writing the calculation down on 2-dimensional paper (which remains easier than writing on 3-dimensional air, even still today).

Anyway, I'll get off my soapbox for now ...

answered May 18 '15 at 3:51 – james dolan

















(Short form of Jair Taylor's answer)

In the expression $v^T A B w$, the vectors $v$ and $w$ "see" $A$ and $B$ from different ends, hence in different order.

answered May 13 '15 at 15:17 – Hagen von Eitzen

















Considering the dimensions of the various matrices shows that reversing the order is necessary.

If $A$ is $m \times p$ and $B$ is $p \times n$, then

$AB$ is $m \times n$,

$(AB)^T$ is $n \times m$,

$A^T$ is $p \times m$ and $B^T$ is $n \times p$.

Thus $B^T A^T$ is defined and has the same dimensions as $(AB)^T$, whereas $A^T B^T$ is in general not even defined.

answered May 19 '15 at 18:52 – John C Frain
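A tiny NumPy sketch of this dimension bookkeeping (my addition; the shapes are arbitrary): with rectangular factors, $B^TA^T$ has exactly the shape of $(AB)^T$, while $A^TB^T$ is not even defined.

    import numpy as np

    rng = np.random.default_rng(3)
    A = rng.standard_normal((2, 3))   # m x p
    B = rng.standard_normal((3, 5))   # p x n

    print((A @ B).T.shape)       # (5, 2), i.e. n x m
    print((B.T @ A.T).shape)     # (5, 2) as well

    try:
        A.T @ B.T                # (3, 2) @ (5, 3): inner dimensions don't match
    except ValueError as err:
        print("A.T @ B.T is not defined:", err)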

















My professor always called these the shoe-sock theorems or the like, the idea basically being that you put on your socks first and then your shoes, but to undo this you must take off the shoes first and then the socks.

answered May 19 '15 at 19:00 – Zelos Malum

















(image omitted)

Turn (transpose) onto the street $B^T$ perpendicular to $B$, then turn (transpose) onto $A^T$ perpendicular to $A$.

answered Mar 30 at 12:28 – farruhota
















