Probability of a 3 digits occurence from a sequence of 64 random hexadecimal symbols? The Next CEO of Stack OverflowTold by Professor that this is PIE, but don't see how it's PIE. Help understanding what constitutes the sets, or alternative ways to solve?Probability of finding 2012 before any other occurence of 012 in a random infinite sequence of digits 0,1,2Probability of random integer's digits summing to 12Roulette Probability - Occurence of the Same ValueProbability of co-occurenceProbability that among 3 random digits two different oneOccurence and Position of digits in a combination lockProbability of occurence of alphabetsProbability of Occurence of HEART or EARTHProbability of Next OccurenceProbability of having a first occurence in Poisson random distribution

Which one is the true statement?

Should I tutor a student who I know has cheated on their homework?

The past simple of "gaslight" – "gaslighted" or "gaslit"?

Rotate a column

What was the first Unix version to run on a microcomputer?

How to prove a simple equation?

Would a completely good Muggle be able to use a wand?

Can you be charged for obstruction for refusing to answer questions?

What did we know about the Kessel run before the prequels?

Reference request: Grassmannian and Plucker coordinates in type B, C, D

Is it possible to replace duplicates of a character with one character using tr

How to avoid supervisors with prejudiced views?

How I can get glyphs from a fraktur font and use them as identifiers?

How to invert MapIndexed on a ragged structure? How to construct a tree from rules?

Why did CATV standarize in 75 ohms and everyone else in 50?

What does "Its cash flow is deeply negative" mean?

How to install OpenCV on Raspbian Stretch?

Why didn't Khan get resurrected in the Genesis Explosion?

TikZ: How to reverse arrow direction without switching start/end point?

Help understanding this unsettling image of Titan, Epimetheus, and Saturn's rings?

Flying from Cape Town to England and return to another province

A Man With a Stainless Steel Endoskeleton (like The Terminator) Fighting Cloaked Aliens Only He Can See

How to edit “Name” property in GCI output?

How to delete every two lines after 3rd lines in a file contains very large number of lines?



Probability of a 3 digits occurence from a sequence of 64 random hexadecimal symbols?



The Next CEO of Stack OverflowTold by Professor that this is PIE, but don't see how it's PIE. Help understanding what constitutes the sets, or alternative ways to solve?Probability of finding 2012 before any other occurence of 012 in a random infinite sequence of digits 0,1,2Probability of random integer's digits summing to 12Roulette Probability - Occurence of the Same ValueProbability of co-occurenceProbability that among 3 random digits two different oneOccurence and Position of digits in a combination lockProbability of occurence of alphabetsProbability of Occurence of HEART or EARTHProbability of Next OccurenceProbability of having a first occurence in Poisson random distribution










2












$begingroup$


I have a software test failing from time to time, due to the fact that the string 123 appears when generating a string with SecureRandom.hex(32) (Ruby), which gives a 64 symbols string.

This includes (A-F) and (0-9), so 16 symbols possible.



I have played a little, and I noticed that it can roughly happens 1500 times over 100,000 calls to the function, so around 1.5%.



What would be the formula to have the exact probability of this test failing?



As a developer, I had to share a little bit of code: JS Fiddle.



Edit: Answer's WolframAlpha link



54392126209570846953192790832763093471349103670511891931049735583247364107/
3618502788666131106986593281521497120414687020801267626233049500247285301248
≈0.0150317
≈1.50317%









share|cite|improve this question











$endgroup$











  • $begingroup$
    Should'nt it be (A-F) and 16 symbols?
    $endgroup$
    – Bill O'Haran
    Apr 20 '18 at 13:38











  • $begingroup$
    You are absolutely right, and hex(32) actually returns 64 symbols too :)
    $endgroup$
    – Cyril Gandon
    Apr 20 '18 at 13:42






  • 1




    $begingroup$
    Not an answer, but if you are interested in an estimate, assuming each of the 62 3-digit sequences to be independent gives $1 - (1 - (1/16)^3)^62 approx 1.502%$.
    $endgroup$
    – antkam
    Apr 20 '18 at 13:55
















2












$begingroup$


I have a software test failing from time to time, due to the fact that the string 123 appears when generating a string with SecureRandom.hex(32) (Ruby), which gives a 64 symbols string.

This includes (A-F) and (0-9), so 16 symbols possible.



I have played a little, and I noticed that it can roughly happens 1500 times over 100,000 calls to the function, so around 1.5%.



What would be the formula to have the exact probability of this test failing?



As a developer, I had to share a little bit of code: JS Fiddle.



Edit: Answer's WolframAlpha link



54392126209570846953192790832763093471349103670511891931049735583247364107/
3618502788666131106986593281521497120414687020801267626233049500247285301248
≈0.0150317
≈1.50317%









share|cite|improve this question











$endgroup$











  • $begingroup$
    Should'nt it be (A-F) and 16 symbols?
    $endgroup$
    – Bill O'Haran
    Apr 20 '18 at 13:38











  • $begingroup$
    You are absolutely right, and hex(32) actually returns 64 symbols too :)
    $endgroup$
    – Cyril Gandon
    Apr 20 '18 at 13:42






  • 1




    $begingroup$
    Not an answer, but if you are interested in an estimate, assuming each of the 62 3-digit sequences to be independent gives $1 - (1 - (1/16)^3)^62 approx 1.502%$.
    $endgroup$
    – antkam
    Apr 20 '18 at 13:55














2












2








2


0



$begingroup$


I have a software test failing from time to time, due to the fact that the string 123 appears when generating a string with SecureRandom.hex(32) (Ruby), which gives a 64 symbols string.

This includes (A-F) and (0-9), so 16 symbols possible.



I have played a little, and I noticed that it can roughly happens 1500 times over 100,000 calls to the function, so around 1.5%.



What would be the formula to have the exact probability of this test failing?



As a developer, I had to share a little bit of code: JS Fiddle.



Edit: Answer's WolframAlpha link



54392126209570846953192790832763093471349103670511891931049735583247364107/
3618502788666131106986593281521497120414687020801267626233049500247285301248
≈0.0150317
≈1.50317%









share|cite|improve this question











$endgroup$




I have a software test failing from time to time, due to the fact that the string 123 appears when generating a string with SecureRandom.hex(32) (Ruby), which gives a 64 symbols string.

This includes (A-F) and (0-9), so 16 symbols possible.



I have played a little, and I noticed that it can roughly happens 1500 times over 100,000 calls to the function, so around 1.5%.



What would be the formula to have the exact probability of this test failing?



As a developer, I had to share a little bit of code: JS Fiddle.



Edit: Answer's WolframAlpha link



54392126209570846953192790832763093471349103670511891931049735583247364107/
3618502788666131106986593281521497120414687020801267626233049500247285301248
≈0.0150317
≈1.50317%






probability combinatorics






share|cite|improve this question















share|cite|improve this question













share|cite|improve this question




share|cite|improve this question








edited Apr 20 '18 at 15:12







Cyril Gandon

















asked Apr 20 '18 at 13:30









Cyril GandonCyril Gandon

1136




1136











  • $begingroup$
    Should'nt it be (A-F) and 16 symbols?
    $endgroup$
    – Bill O'Haran
    Apr 20 '18 at 13:38











  • $begingroup$
    You are absolutely right, and hex(32) actually returns 64 symbols too :)
    $endgroup$
    – Cyril Gandon
    Apr 20 '18 at 13:42






  • 1




    $begingroup$
    Not an answer, but if you are interested in an estimate, assuming each of the 62 3-digit sequences to be independent gives $1 - (1 - (1/16)^3)^62 approx 1.502%$.
    $endgroup$
    – antkam
    Apr 20 '18 at 13:55

















  • $begingroup$
    Should'nt it be (A-F) and 16 symbols?
    $endgroup$
    – Bill O'Haran
    Apr 20 '18 at 13:38











  • $begingroup$
    You are absolutely right, and hex(32) actually returns 64 symbols too :)
    $endgroup$
    – Cyril Gandon
    Apr 20 '18 at 13:42






  • 1




    $begingroup$
    Not an answer, but if you are interested in an estimate, assuming each of the 62 3-digit sequences to be independent gives $1 - (1 - (1/16)^3)^62 approx 1.502%$.
    $endgroup$
    – antkam
    Apr 20 '18 at 13:55
















$begingroup$
Should'nt it be (A-F) and 16 symbols?
$endgroup$
– Bill O'Haran
Apr 20 '18 at 13:38





$begingroup$
Should'nt it be (A-F) and 16 symbols?
$endgroup$
– Bill O'Haran
Apr 20 '18 at 13:38













$begingroup$
You are absolutely right, and hex(32) actually returns 64 symbols too :)
$endgroup$
– Cyril Gandon
Apr 20 '18 at 13:42




$begingroup$
You are absolutely right, and hex(32) actually returns 64 symbols too :)
$endgroup$
– Cyril Gandon
Apr 20 '18 at 13:42




1




1




$begingroup$
Not an answer, but if you are interested in an estimate, assuming each of the 62 3-digit sequences to be independent gives $1 - (1 - (1/16)^3)^62 approx 1.502%$.
$endgroup$
– antkam
Apr 20 '18 at 13:55





$begingroup$
Not an answer, but if you are interested in an estimate, assuming each of the 62 3-digit sequences to be independent gives $1 - (1 - (1/16)^3)^62 approx 1.502%$.
$endgroup$
– antkam
Apr 20 '18 at 13:55











3 Answers
3






active

oldest

votes


















3












$begingroup$

An exact answer, via Inclusion Exclusion Principle:



Let $A_i =$ the set of 64-long strings where "123" appears starting at the $i$th position ($1 le i le 62$). E.g. if "123" appears at both the 2nd and 40th positions in string $x$, then $x in A_2$ and $x in A_40$, i.e. $x in A_2 cap A_40 $.



Then $S = bigcup_i A_i=$ the set of strings where "123" appears at least once, and your desired probability is $|S|/16^64$.



The Inclusion Exclusion Principle gives:



$$|S| = |bigcup A_i| = sum_i |A_i| - sum_i<j |A_i cap A_j| + sum_i<j<k |A_i cap A_j cap A_k| - dots $$



Obviously $|A_i| = 16^61$ as the other 61 positions can be anything. So the first term is $sum_i |A_i| = 62 cdot 16^61$.



Now two occurrences of "123" cannot overlap (unlike e.g. if you're looking for "111" in which case they CAN overlap). So if $j - i < 3$, then $|A_i cap A_j| = 0$. For the non-trivial case of $j - i ge 3$, the other $64 - 3 - 3 = 58$ positions can be filled with anything, so $|A_i cap A_j| = 16^58$.



Meanwhile, how many $(i, j)$ pairs are there s.t. $j - i ge 3$? Imagine "merging" each "123" into a new character "!", so the final string length is now $64-2-2 = 60$ instead and there are two "!" characters. The number of ways to do this is simply $60 choose 2$. Thus the 2nd term is:



$$ sum_i<j |A_i cap A_j| = 60 choose 2 16^58$$



The $m$th term is similar and we have:



$$ sum_i_1 < i_2 < cdots < i_m |A_i_1 cap A_i_2 cap cdots cap A_i_m| = 64 - 2m choose m 16^64 - 3m $$



So the final result is:



$$ |S| = sum_m=1^21 (-1)^m+1 64 - 2m choose m 16^64 - 3m $$






share|cite|improve this answer









$endgroup$












  • $begingroup$
    Awesome, I would have never been able to compute that!
    $endgroup$
    – Cyril Gandon
    Apr 20 '18 at 15:14


















1












$begingroup$

This answer is based upon the Goulden-Jackson Cluster Method.




We consider the set words of length $ngeq 0$ built from the alphabet $mathcalV=0,ldots,9,A,ldots,F$ and the set $B=123$ of bad words, which are not allowed to be part of the words we are looking for.



  • We derive a generating function $f(s)$ with the coefficient of $s^n$ being the number of these words of length $n$.


  • Since we are looking for the number of words which contain the bad word $123$, the resulting generating function is the generating function of all words minus $f(s)$
    beginalign*
    &1+16s+16^2s^2+16^3s^3cdots-f(s)=frac11-16s-f(s)
    endalign*




According to the paper (p.7) the generating function $f(s)$ is
beginalign*
f(s)=frac11-ds-textweight(mathcalC)tag1
endalign*

with $d=|mathcalV|=16$, the size of the alphabet and $mathcalC$ the weight-numerator of bad words with
beginalign*
textweight(mathcalC)=textweight(mathcalC[123])
endalign*



We calculate according to the paper
beginalign*
textweight(mathcalC)=textweight(mathcalC[123])&=-s^3tag2\
endalign*




It follows from (1) and (2)



beginalign*
f(s)=frac11-16s+s^3\
endalign*

and the generated function counting all strings which contain $123$ is
beginalign*
&colorbluefrac11-16s-frac11-16s+s^3=s^3 + colorblue32 s^4 + 768 s^5 + 16,383 s^6 +cdotstag3\
endalign*



The coefficients of the series were calculated with the help of Wolfram Alpha. We see for instance there are $colorblue32$ words of length $4$ which contain the bad word $123$. These are $$(0..9,A..F)123qquadtext and qquad 123(0..9,A..F).$$




The coefficient of $s^n$



In fact we are interested in the coefficient of $s^64$. We calculate the coefficient of $s^n$ from (3). It is convenient to use the coefficient of operator $[s^n]$ to denote the coefficient of $s^n$ of a series.



We obtain from (3) for $ngeq 0$



beginalign*
colorblue[s^n]&colorblueleft(frac11-16s-frac11-16s+s^3right)\
&=[s^n]sum_m=0^infty 16^ms^m-[s^n]sum_m=0^infty s^m(16-s^2)^m\
&=16^n-[s^n]sum_m=0^infty s^msum_k=0^mbinommk(-1)^ks^2k16^m-k\
&=16^n-sum_m=0^n[s^n-m]sum_k=0^mbinommk(-1)^ks^2k16^m-k\
&=16^n-sum_m=0^n[s^m]sum_k=0^n-mbinomn-mk(-1)^ks^2k16^n-m-k\
&=16^n-sum_m=0^lfloor n/2 rfloor [s^2m]sum_k=0^n-2mbinomn-2mk(-1)^ks^2k16^n-2m-k\
&=16^n-sum_m=0^lfloor n/3 rfloorbinomn-2mm(-1)^m16^n-3m\
&,,colorblue=sum_m=1^lfloor n/3 rfloorbinomn-2mm(-1)^m+116^n-3mtag4\
endalign*




Finally we obtain from (4) in accordance with the answer of @antkam the wanted probability



beginalign*
colorblue[s^64]colorblueleft(frac11-16s-frac11-16s+s^3right)16^-64&colorblue=sum_m=1^21binom64-2mm(-1)^m+116^-3m\
&colorbluesimeq 0.01503,16662,40506
endalign*



which gives roughly $colorblue1.5%$.







share|cite|improve this answer











$endgroup$




















    0












    $begingroup$

    Just a quick hint: total number of all possible strings is $16^32$, the number of strings that contain at least one occurrence of "$123$" is $32 cdot 16^29=2cdot 16^30$ ($32$ spots to put substring "$123$" and we need to fill remaining $29$ positions). The probability for truly random generation should be $$frac2cdot 16^3016^32=frac1128$$






    share|cite|improve this answer









    $endgroup$








    • 1




      $begingroup$
      the string length is 64, there are 62 places to put "123", and this argument double-counts the cases where "123" appears twice, etc.
      $endgroup$
      – antkam
      Apr 20 '18 at 14:19











    Your Answer





    StackExchange.ifUsing("editor", function ()
    return StackExchange.using("mathjaxEditing", function ()
    StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
    StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
    );
    );
    , "mathjax-editing");

    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "69"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: true,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    imageUploader:
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    ,
    noCode: true, onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );













    draft saved

    draft discarded


















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f2746006%2fprobability-of-a-3-digits-occurence-from-a-sequence-of-64-random-hexadecimal-sym%23new-answer', 'question_page');

    );

    Post as a guest















    Required, but never shown

























    3 Answers
    3






    active

    oldest

    votes








    3 Answers
    3






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    3












    $begingroup$

    An exact answer, via Inclusion Exclusion Principle:



    Let $A_i =$ the set of 64-long strings where "123" appears starting at the $i$th position ($1 le i le 62$). E.g. if "123" appears at both the 2nd and 40th positions in string $x$, then $x in A_2$ and $x in A_40$, i.e. $x in A_2 cap A_40 $.



    Then $S = bigcup_i A_i=$ the set of strings where "123" appears at least once, and your desired probability is $|S|/16^64$.



    The Inclusion Exclusion Principle gives:



    $$|S| = |bigcup A_i| = sum_i |A_i| - sum_i<j |A_i cap A_j| + sum_i<j<k |A_i cap A_j cap A_k| - dots $$



    Obviously $|A_i| = 16^61$ as the other 61 positions can be anything. So the first term is $sum_i |A_i| = 62 cdot 16^61$.



    Now two occurrences of "123" cannot overlap (unlike e.g. if you're looking for "111" in which case they CAN overlap). So if $j - i < 3$, then $|A_i cap A_j| = 0$. For the non-trivial case of $j - i ge 3$, the other $64 - 3 - 3 = 58$ positions can be filled with anything, so $|A_i cap A_j| = 16^58$.



    Meanwhile, how many $(i, j)$ pairs are there s.t. $j - i ge 3$? Imagine "merging" each "123" into a new character "!", so the final string length is now $64-2-2 = 60$ instead and there are two "!" characters. The number of ways to do this is simply $60 choose 2$. Thus the 2nd term is:



    $$ sum_i<j |A_i cap A_j| = 60 choose 2 16^58$$



    The $m$th term is similar and we have:



    $$ sum_i_1 < i_2 < cdots < i_m |A_i_1 cap A_i_2 cap cdots cap A_i_m| = 64 - 2m choose m 16^64 - 3m $$



    So the final result is:



    $$ |S| = sum_m=1^21 (-1)^m+1 64 - 2m choose m 16^64 - 3m $$






    share|cite|improve this answer









    $endgroup$












    • $begingroup$
      Awesome, I would have never been able to compute that!
      $endgroup$
      – Cyril Gandon
      Apr 20 '18 at 15:14















    3












    $begingroup$

    An exact answer, via Inclusion Exclusion Principle:



    Let $A_i =$ the set of 64-long strings where "123" appears starting at the $i$th position ($1 le i le 62$). E.g. if "123" appears at both the 2nd and 40th positions in string $x$, then $x in A_2$ and $x in A_40$, i.e. $x in A_2 cap A_40 $.



    Then $S = bigcup_i A_i=$ the set of strings where "123" appears at least once, and your desired probability is $|S|/16^64$.



    The Inclusion Exclusion Principle gives:



    $$|S| = |bigcup A_i| = sum_i |A_i| - sum_i<j |A_i cap A_j| + sum_i<j<k |A_i cap A_j cap A_k| - dots $$



    Obviously $|A_i| = 16^61$ as the other 61 positions can be anything. So the first term is $sum_i |A_i| = 62 cdot 16^61$.



    Now two occurrences of "123" cannot overlap (unlike e.g. if you're looking for "111" in which case they CAN overlap). So if $j - i < 3$, then $|A_i cap A_j| = 0$. For the non-trivial case of $j - i ge 3$, the other $64 - 3 - 3 = 58$ positions can be filled with anything, so $|A_i cap A_j| = 16^58$.



    Meanwhile, how many $(i, j)$ pairs are there s.t. $j - i ge 3$? Imagine "merging" each "123" into a new character "!", so the final string length is now $64-2-2 = 60$ instead and there are two "!" characters. The number of ways to do this is simply $60 choose 2$. Thus the 2nd term is:



    $$ sum_i<j |A_i cap A_j| = 60 choose 2 16^58$$



    The $m$th term is similar and we have:



    $$ sum_i_1 < i_2 < cdots < i_m |A_i_1 cap A_i_2 cap cdots cap A_i_m| = 64 - 2m choose m 16^64 - 3m $$



    So the final result is:



    $$ |S| = sum_m=1^21 (-1)^m+1 64 - 2m choose m 16^64 - 3m $$






    share|cite|improve this answer









    $endgroup$












    • $begingroup$
      Awesome, I would have never been able to compute that!
      $endgroup$
      – Cyril Gandon
      Apr 20 '18 at 15:14













    3












    3








    3





    $begingroup$

    An exact answer, via Inclusion Exclusion Principle:



    Let $A_i =$ the set of 64-long strings where "123" appears starting at the $i$th position ($1 le i le 62$). E.g. if "123" appears at both the 2nd and 40th positions in string $x$, then $x in A_2$ and $x in A_40$, i.e. $x in A_2 cap A_40 $.



    Then $S = bigcup_i A_i=$ the set of strings where "123" appears at least once, and your desired probability is $|S|/16^64$.



    The Inclusion Exclusion Principle gives:



    $$|S| = |bigcup A_i| = sum_i |A_i| - sum_i<j |A_i cap A_j| + sum_i<j<k |A_i cap A_j cap A_k| - dots $$



    Obviously $|A_i| = 16^61$ as the other 61 positions can be anything. So the first term is $sum_i |A_i| = 62 cdot 16^61$.



    Now two occurrences of "123" cannot overlap (unlike e.g. if you're looking for "111" in which case they CAN overlap). So if $j - i < 3$, then $|A_i cap A_j| = 0$. For the non-trivial case of $j - i ge 3$, the other $64 - 3 - 3 = 58$ positions can be filled with anything, so $|A_i cap A_j| = 16^58$.



    Meanwhile, how many $(i, j)$ pairs are there s.t. $j - i ge 3$? Imagine "merging" each "123" into a new character "!", so the final string length is now $64-2-2 = 60$ instead and there are two "!" characters. The number of ways to do this is simply $60 choose 2$. Thus the 2nd term is:



    $$ sum_i<j |A_i cap A_j| = 60 choose 2 16^58$$



    The $m$th term is similar and we have:



    $$ sum_i_1 < i_2 < cdots < i_m |A_i_1 cap A_i_2 cap cdots cap A_i_m| = 64 - 2m choose m 16^64 - 3m $$



    So the final result is:



    $$ |S| = sum_m=1^21 (-1)^m+1 64 - 2m choose m 16^64 - 3m $$






    share|cite|improve this answer









    $endgroup$



    An exact answer, via Inclusion Exclusion Principle:



    Let $A_i =$ the set of 64-long strings where "123" appears starting at the $i$th position ($1 le i le 62$). E.g. if "123" appears at both the 2nd and 40th positions in string $x$, then $x in A_2$ and $x in A_40$, i.e. $x in A_2 cap A_40 $.



    Then $S = bigcup_i A_i=$ the set of strings where "123" appears at least once, and your desired probability is $|S|/16^64$.



    The Inclusion Exclusion Principle gives:



    $$|S| = |bigcup A_i| = sum_i |A_i| - sum_i<j |A_i cap A_j| + sum_i<j<k |A_i cap A_j cap A_k| - dots $$



    Obviously $|A_i| = 16^61$ as the other 61 positions can be anything. So the first term is $sum_i |A_i| = 62 cdot 16^61$.



    Now two occurrences of "123" cannot overlap (unlike e.g. if you're looking for "111" in which case they CAN overlap). So if $j - i < 3$, then $|A_i cap A_j| = 0$. For the non-trivial case of $j - i ge 3$, the other $64 - 3 - 3 = 58$ positions can be filled with anything, so $|A_i cap A_j| = 16^58$.



    Meanwhile, how many $(i, j)$ pairs are there s.t. $j - i ge 3$? Imagine "merging" each "123" into a new character "!", so the final string length is now $64-2-2 = 60$ instead and there are two "!" characters. The number of ways to do this is simply $60 choose 2$. Thus the 2nd term is:



    $$ sum_i<j |A_i cap A_j| = 60 choose 2 16^58$$



    The $m$th term is similar and we have:



    $$ sum_i_1 < i_2 < cdots < i_m |A_i_1 cap A_i_2 cap cdots cap A_i_m| = 64 - 2m choose m 16^64 - 3m $$



    So the final result is:



    $$ |S| = sum_m=1^21 (-1)^m+1 64 - 2m choose m 16^64 - 3m $$







    share|cite|improve this answer












    share|cite|improve this answer



    share|cite|improve this answer










    answered Apr 20 '18 at 14:58









    antkamantkam

    2,587312




    2,587312











    • $begingroup$
      Awesome, I would have never been able to compute that!
      $endgroup$
      – Cyril Gandon
      Apr 20 '18 at 15:14
















    • $begingroup$
      Awesome, I would have never been able to compute that!
      $endgroup$
      – Cyril Gandon
      Apr 20 '18 at 15:14















    $begingroup$
    Awesome, I would have never been able to compute that!
    $endgroup$
    – Cyril Gandon
    Apr 20 '18 at 15:14




    $begingroup$
    Awesome, I would have never been able to compute that!
    $endgroup$
    – Cyril Gandon
    Apr 20 '18 at 15:14











    1












    $begingroup$

    This answer is based upon the Goulden-Jackson Cluster Method.




    We consider the set words of length $ngeq 0$ built from the alphabet $mathcalV=0,ldots,9,A,ldots,F$ and the set $B=123$ of bad words, which are not allowed to be part of the words we are looking for.



    • We derive a generating function $f(s)$ with the coefficient of $s^n$ being the number of these words of length $n$.


    • Since we are looking for the number of words which contain the bad word $123$, the resulting generating function is the generating function of all words minus $f(s)$
      beginalign*
      &1+16s+16^2s^2+16^3s^3cdots-f(s)=frac11-16s-f(s)
      endalign*




    According to the paper (p.7) the generating function $f(s)$ is
    beginalign*
    f(s)=frac11-ds-textweight(mathcalC)tag1
    endalign*

    with $d=|mathcalV|=16$, the size of the alphabet and $mathcalC$ the weight-numerator of bad words with
    beginalign*
    textweight(mathcalC)=textweight(mathcalC[123])
    endalign*



    We calculate according to the paper
    beginalign*
    textweight(mathcalC)=textweight(mathcalC[123])&=-s^3tag2\
    endalign*




    It follows from (1) and (2)



    beginalign*
    f(s)=frac11-16s+s^3\
    endalign*

    and the generated function counting all strings which contain $123$ is
    beginalign*
    &colorbluefrac11-16s-frac11-16s+s^3=s^3 + colorblue32 s^4 + 768 s^5 + 16,383 s^6 +cdotstag3\
    endalign*



    The coefficients of the series were calculated with the help of Wolfram Alpha. We see for instance there are $colorblue32$ words of length $4$ which contain the bad word $123$. These are $$(0..9,A..F)123qquadtext and qquad 123(0..9,A..F).$$




    The coefficient of $s^n$



    In fact we are interested in the coefficient of $s^64$. We calculate the coefficient of $s^n$ from (3). It is convenient to use the coefficient of operator $[s^n]$ to denote the coefficient of $s^n$ of a series.



    We obtain from (3) for $ngeq 0$



    beginalign*
    colorblue[s^n]&colorblueleft(frac11-16s-frac11-16s+s^3right)\
    &=[s^n]sum_m=0^infty 16^ms^m-[s^n]sum_m=0^infty s^m(16-s^2)^m\
    &=16^n-[s^n]sum_m=0^infty s^msum_k=0^mbinommk(-1)^ks^2k16^m-k\
    &=16^n-sum_m=0^n[s^n-m]sum_k=0^mbinommk(-1)^ks^2k16^m-k\
    &=16^n-sum_m=0^n[s^m]sum_k=0^n-mbinomn-mk(-1)^ks^2k16^n-m-k\
    &=16^n-sum_m=0^lfloor n/2 rfloor [s^2m]sum_k=0^n-2mbinomn-2mk(-1)^ks^2k16^n-2m-k\
    &=16^n-sum_m=0^lfloor n/3 rfloorbinomn-2mm(-1)^m16^n-3m\
    &,,colorblue=sum_m=1^lfloor n/3 rfloorbinomn-2mm(-1)^m+116^n-3mtag4\
    endalign*




    Finally we obtain from (4) in accordance with the answer of @antkam the wanted probability



    beginalign*
    colorblue[s^64]colorblueleft(frac11-16s-frac11-16s+s^3right)16^-64&colorblue=sum_m=1^21binom64-2mm(-1)^m+116^-3m\
    &colorbluesimeq 0.01503,16662,40506
    endalign*



    which gives roughly $colorblue1.5%$.







    share|cite|improve this answer











    $endgroup$

















      1












      $begingroup$

      This answer is based upon the Goulden-Jackson Cluster Method.




      We consider the set words of length $ngeq 0$ built from the alphabet $mathcalV=0,ldots,9,A,ldots,F$ and the set $B=123$ of bad words, which are not allowed to be part of the words we are looking for.



      • We derive a generating function $f(s)$ with the coefficient of $s^n$ being the number of these words of length $n$.


      • Since we are looking for the number of words which contain the bad word $123$, the resulting generating function is the generating function of all words minus $f(s)$
        beginalign*
        &1+16s+16^2s^2+16^3s^3cdots-f(s)=frac11-16s-f(s)
        endalign*




      According to the paper (p.7) the generating function $f(s)$ is
      beginalign*
      f(s)=frac11-ds-textweight(mathcalC)tag1
      endalign*

      with $d=|mathcalV|=16$, the size of the alphabet and $mathcalC$ the weight-numerator of bad words with
      beginalign*
      textweight(mathcalC)=textweight(mathcalC[123])
      endalign*



      We calculate according to the paper
      beginalign*
      textweight(mathcalC)=textweight(mathcalC[123])&=-s^3tag2\
      endalign*




      It follows from (1) and (2)



      beginalign*
      f(s)=frac11-16s+s^3\
      endalign*

      and the generated function counting all strings which contain $123$ is
      beginalign*
      &colorbluefrac11-16s-frac11-16s+s^3=s^3 + colorblue32 s^4 + 768 s^5 + 16,383 s^6 +cdotstag3\
      endalign*



      The coefficients of the series were calculated with the help of Wolfram Alpha. We see for instance there are $colorblue32$ words of length $4$ which contain the bad word $123$. These are $$(0..9,A..F)123qquadtext and qquad 123(0..9,A..F).$$




      The coefficient of $s^n$



      In fact we are interested in the coefficient of $s^64$. We calculate the coefficient of $s^n$ from (3). It is convenient to use the coefficient of operator $[s^n]$ to denote the coefficient of $s^n$ of a series.



      We obtain from (3) for $ngeq 0$



      beginalign*
      colorblue[s^n]&colorblueleft(frac11-16s-frac11-16s+s^3right)\
      &=[s^n]sum_m=0^infty 16^ms^m-[s^n]sum_m=0^infty s^m(16-s^2)^m\
      &=16^n-[s^n]sum_m=0^infty s^msum_k=0^mbinommk(-1)^ks^2k16^m-k\
      &=16^n-sum_m=0^n[s^n-m]sum_k=0^mbinommk(-1)^ks^2k16^m-k\
      &=16^n-sum_m=0^n[s^m]sum_k=0^n-mbinomn-mk(-1)^ks^2k16^n-m-k\
      &=16^n-sum_m=0^lfloor n/2 rfloor [s^2m]sum_k=0^n-2mbinomn-2mk(-1)^ks^2k16^n-2m-k\
      &=16^n-sum_m=0^lfloor n/3 rfloorbinomn-2mm(-1)^m16^n-3m\
      &,,colorblue=sum_m=1^lfloor n/3 rfloorbinomn-2mm(-1)^m+116^n-3mtag4\
      endalign*




      Finally we obtain from (4) in accordance with the answer of @antkam the wanted probability



      beginalign*
      colorblue[s^64]colorblueleft(frac11-16s-frac11-16s+s^3right)16^-64&colorblue=sum_m=1^21binom64-2mm(-1)^m+116^-3m\
      &colorbluesimeq 0.01503,16662,40506
      endalign*



      which gives roughly $colorblue1.5%$.







      share|cite|improve this answer











      $endgroup$















        1












        1








        1





        $begingroup$

        This answer is based upon the Goulden-Jackson Cluster Method.




        We consider the set words of length $ngeq 0$ built from the alphabet $mathcalV=0,ldots,9,A,ldots,F$ and the set $B=123$ of bad words, which are not allowed to be part of the words we are looking for.



        • We derive a generating function $f(s)$ with the coefficient of $s^n$ being the number of these words of length $n$.


        • Since we are looking for the number of words which contain the bad word $123$, the resulting generating function is the generating function of all words minus $f(s)$
          beginalign*
          &1+16s+16^2s^2+16^3s^3cdots-f(s)=frac11-16s-f(s)
          endalign*




        According to the paper (p.7) the generating function $f(s)$ is
        beginalign*
        f(s)=frac11-ds-textweight(mathcalC)tag1
        endalign*

        with $d=|mathcalV|=16$, the size of the alphabet and $mathcalC$ the weight-numerator of bad words with
        beginalign*
        textweight(mathcalC)=textweight(mathcalC[123])
        endalign*



        We calculate according to the paper
        beginalign*
        textweight(mathcalC)=textweight(mathcalC[123])&=-s^3tag2\
        endalign*




        It follows from (1) and (2)



        beginalign*
        f(s)=frac11-16s+s^3\
        endalign*

        and the generated function counting all strings which contain $123$ is
        beginalign*
        &colorbluefrac11-16s-frac11-16s+s^3=s^3 + colorblue32 s^4 + 768 s^5 + 16,383 s^6 +cdotstag3\
        endalign*



        The coefficients of the series were calculated with the help of Wolfram Alpha. We see for instance there are $colorblue32$ words of length $4$ which contain the bad word $123$. These are $$(0..9,A..F)123qquadtext and qquad 123(0..9,A..F).$$




        The coefficient of $s^n$



        In fact we are interested in the coefficient of $s^64$. We calculate the coefficient of $s^n$ from (3). It is convenient to use the coefficient of operator $[s^n]$ to denote the coefficient of $s^n$ of a series.



        We obtain from (3) for $ngeq 0$



        beginalign*
        colorblue[s^n]&colorblueleft(frac11-16s-frac11-16s+s^3right)\
        &=[s^n]sum_m=0^infty 16^ms^m-[s^n]sum_m=0^infty s^m(16-s^2)^m\
        &=16^n-[s^n]sum_m=0^infty s^msum_k=0^mbinommk(-1)^ks^2k16^m-k\
        &=16^n-sum_m=0^n[s^n-m]sum_k=0^mbinommk(-1)^ks^2k16^m-k\
        &=16^n-sum_m=0^n[s^m]sum_k=0^n-mbinomn-mk(-1)^ks^2k16^n-m-k\
        &=16^n-sum_m=0^lfloor n/2 rfloor [s^2m]sum_k=0^n-2mbinomn-2mk(-1)^ks^2k16^n-2m-k\
        &=16^n-sum_m=0^lfloor n/3 rfloorbinomn-2mm(-1)^m16^n-3m\
        &,,colorblue=sum_m=1^lfloor n/3 rfloorbinomn-2mm(-1)^m+116^n-3mtag4\
        endalign*




        Finally we obtain from (4) in accordance with the answer of @antkam the wanted probability



        beginalign*
        colorblue[s^64]colorblueleft(frac11-16s-frac11-16s+s^3right)16^-64&colorblue=sum_m=1^21binom64-2mm(-1)^m+116^-3m\
        &colorbluesimeq 0.01503,16662,40506
        endalign*



        which gives roughly $colorblue1.5%$.







        share|cite|improve this answer











        $endgroup$



        This answer is based upon the Goulden-Jackson Cluster Method.




        We consider the set words of length $ngeq 0$ built from the alphabet $mathcalV=0,ldots,9,A,ldots,F$ and the set $B=123$ of bad words, which are not allowed to be part of the words we are looking for.



        • We derive a generating function $f(s)$ with the coefficient of $s^n$ being the number of these words of length $n$.


        • Since we are looking for the number of words which contain the bad word $123$, the resulting generating function is the generating function of all words minus $f(s)$
          beginalign*
          &1+16s+16^2s^2+16^3s^3cdots-f(s)=frac11-16s-f(s)
          endalign*




        According to the paper (p.7) the generating function $f(s)$ is
        beginalign*
        f(s)=frac11-ds-textweight(mathcalC)tag1
        endalign*

        with $d=|mathcalV|=16$, the size of the alphabet and $mathcalC$ the weight-numerator of bad words with
        beginalign*
        textweight(mathcalC)=textweight(mathcalC[123])
        endalign*



        We calculate according to the paper
        beginalign*
        textweight(mathcalC)=textweight(mathcalC[123])&=-s^3tag2\
        endalign*




        It follows from (1) and (2)



        beginalign*
        f(s)=frac11-16s+s^3\
        endalign*

        and the generated function counting all strings which contain $123$ is
        beginalign*
        &colorbluefrac11-16s-frac11-16s+s^3=s^3 + colorblue32 s^4 + 768 s^5 + 16,383 s^6 +cdotstag3\
        endalign*



        The coefficients of the series were calculated with the help of Wolfram Alpha. We see for instance there are $colorblue32$ words of length $4$ which contain the bad word $123$. These are $$(0..9,A..F)123qquadtext and qquad 123(0..9,A..F).$$




        The coefficient of $s^n$



        In fact we are interested in the coefficient of $s^64$. We calculate the coefficient of $s^n$ from (3). It is convenient to use the coefficient of operator $[s^n]$ to denote the coefficient of $s^n$ of a series.



        We obtain from (3) for $ngeq 0$



        beginalign*
        colorblue[s^n]&colorblueleft(frac11-16s-frac11-16s+s^3right)\
        &=[s^n]sum_m=0^infty 16^ms^m-[s^n]sum_m=0^infty s^m(16-s^2)^m\
        &=16^n-[s^n]sum_m=0^infty s^msum_k=0^mbinommk(-1)^ks^2k16^m-k\
        &=16^n-sum_m=0^n[s^n-m]sum_k=0^mbinommk(-1)^ks^2k16^m-k\
        &=16^n-sum_m=0^n[s^m]sum_k=0^n-mbinomn-mk(-1)^ks^2k16^n-m-k\
        &=16^n-sum_m=0^lfloor n/2 rfloor [s^2m]sum_k=0^n-2mbinomn-2mk(-1)^ks^2k16^n-2m-k\
        &=16^n-sum_m=0^lfloor n/3 rfloorbinomn-2mm(-1)^m16^n-3m\
        &,,colorblue=sum_m=1^lfloor n/3 rfloorbinomn-2mm(-1)^m+116^n-3mtag4\
        endalign*




        Finally we obtain from (4) in accordance with the answer of @antkam the wanted probability



        beginalign*
        colorblue[s^64]colorblueleft(frac11-16s-frac11-16s+s^3right)16^-64&colorblue=sum_m=1^21binom64-2mm(-1)^m+116^-3m\
        &colorbluesimeq 0.01503,16662,40506
        endalign*



        which gives roughly $colorblue1.5%$.








        share|cite|improve this answer














        share|cite|improve this answer



        share|cite|improve this answer








        edited Mar 27 at 18:14

























        answered Mar 27 at 18:04









        Markus ScheuerMarkus Scheuer

        63.5k460151




        63.5k460151





















            0












            $begingroup$

            Just a quick hint: total number of all possible strings is $16^32$, the number of strings that contain at least one occurrence of "$123$" is $32 cdot 16^29=2cdot 16^30$ ($32$ spots to put substring "$123$" and we need to fill remaining $29$ positions). The probability for truly random generation should be $$frac2cdot 16^3016^32=frac1128$$






            share|cite|improve this answer









            $endgroup$








            • 1




              $begingroup$
              the string length is 64, there are 62 places to put "123", and this argument double-counts the cases where "123" appears twice, etc.
              $endgroup$
              – antkam
              Apr 20 '18 at 14:19















            0












            $begingroup$

            Just a quick hint: total number of all possible strings is $16^32$, the number of strings that contain at least one occurrence of "$123$" is $32 cdot 16^29=2cdot 16^30$ ($32$ spots to put substring "$123$" and we need to fill remaining $29$ positions). The probability for truly random generation should be $$frac2cdot 16^3016^32=frac1128$$






            share|cite|improve this answer









            $endgroup$








            • 1




              $begingroup$
              the string length is 64, there are 62 places to put "123", and this argument double-counts the cases where "123" appears twice, etc.
              $endgroup$
              – antkam
              Apr 20 '18 at 14:19













            0












            0








            0





            $begingroup$

            Just a quick hint: total number of all possible strings is $16^32$, the number of strings that contain at least one occurrence of "$123$" is $32 cdot 16^29=2cdot 16^30$ ($32$ spots to put substring "$123$" and we need to fill remaining $29$ positions). The probability for truly random generation should be $$frac2cdot 16^3016^32=frac1128$$






            share|cite|improve this answer









            $endgroup$



            Just a quick hint: total number of all possible strings is $16^32$, the number of strings that contain at least one occurrence of "$123$" is $32 cdot 16^29=2cdot 16^30$ ($32$ spots to put substring "$123$" and we need to fill remaining $29$ positions). The probability for truly random generation should be $$frac2cdot 16^3016^32=frac1128$$







            share|cite|improve this answer












            share|cite|improve this answer



            share|cite|improve this answer










            answered Apr 20 '18 at 14:02









            VasyaVasya

            4,1471618




            4,1471618







            • 1




              $begingroup$
              the string length is 64, there are 62 places to put "123", and this argument double-counts the cases where "123" appears twice, etc.
              $endgroup$
              – antkam
              Apr 20 '18 at 14:19












            • 1




              $begingroup$
              the string length is 64, there are 62 places to put "123", and this argument double-counts the cases where "123" appears twice, etc.
              $endgroup$
              – antkam
              Apr 20 '18 at 14:19







            1




            1




            $begingroup$
            the string length is 64, there are 62 places to put "123", and this argument double-counts the cases where "123" appears twice, etc.
            $endgroup$
            – antkam
            Apr 20 '18 at 14:19




            $begingroup$
            the string length is 64, there are 62 places to put "123", and this argument double-counts the cases where "123" appears twice, etc.
            $endgroup$
            – antkam
            Apr 20 '18 at 14:19

















            draft saved

            draft discarded
















































            Thanks for contributing an answer to Mathematics Stack Exchange!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid


            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.

            Use MathJax to format equations. MathJax reference.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f2746006%2fprobability-of-a-3-digits-occurence-from-a-sequence-of-64-random-hexadecimal-sym%23new-answer', 'question_page');

            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Triangular numbers and gcdProving sum of a set is $0 pmod n$ if $n$ is odd, or $fracn2 pmod n$ if $n$ is even?Is greatest common divisor of two numbers really their smallest linear combination?GCD, LCM RelationshipProve a set of nonnegative integers with greatest common divisor 1 and closed under addition has all but finite many nonnegative integers.all pairs of a and b in an equation containing gcdTriangular Numbers Modulo $k$ - Hit All Values?Understanding the Existence and Uniqueness of the GCDGCD and LCM with logical symbolsThe greatest common divisor of two positive integers less than 100 is equal to 3. Their least common multiple is twelve times one of the integers.Suppose that for all integers $x$, $x|a$ and $x|b$ if and only if $x|c$. Then $c = gcd(a,b)$Which is the gcd of 2 numbers which are multiplied and the result is 600000?

            Ingelân Ynhâld Etymology | Geografy | Skiednis | Polityk en bestjoer | Ekonomy | Demografy | Kultuer | Klimaat | Sjoch ek | Keppelings om utens | Boarnen, noaten en referinsjes Navigaasjemenuwww.gov.ukOffisjele webside fan it regear fan it Feriene KeninkrykOffisjele webside fan it Britske FerkearsburoNederlânsktalige ynformaasje fan it Britske FerkearsburoOffisjele webside fan English Heritage, de organisaasje dy't him ynset foar it behâld fan it Ingelske kultuergoedYnwennertallen fan alle Britske stêden út 'e folkstelling fan 2011Notes en References, op dizze sideEngland

            Հադիս Բովանդակություն Անվանում և նշանակություն | Դասակարգում | Աղբյուրներ | Նավարկման ցանկ