Interrogating the results of a Markov-chain simulation - Help and feedback highly appreciated Announcing the arrival of Valued Associate #679: Cesar Manara Planned maintenance scheduled April 17/18, 2019 at 00:00UTC (8:00pm US/Eastern)Markov chain stationary probability simulationFinding the steady state Markov chain?Calculating the information per symbol of a markov chain sourceFind the period of a state in a Markov chainFind expected value of time to reach a state in Markov chain, by simulationComputational methods for the limiting distribution of a finite ergodic Markov chainMarkov Chain HelpMarkov Chain and ExpectationSimulation of Markov chainIs the records a Markov chain?

How to answer "Have you ever been terminated?"

Withdrew £2800, but only £2000 shows as withdrawn on online banking; what are my obligations?

What is Arya's weapon design?

ListPlot join points by nearest neighbor rather than order

Using audio cues to encourage good posture

How can I make names more distinctive without making them longer?

List *all* the tuples!

How do I name drop voicings

How to deal with a team lead who never gives me credit?

Book where humans were engineered with genes from animal species to survive hostile planets

What is the logic behind the Maharil's explanation of why we don't say שעשה ניסים on Pesach?

Can I cast Passwall to drop an enemy into a 20-foot pit?

What exactly is a "Meth" in Altered Carbon?

Single word antonym of "flightless"

Why aren't air breathing engines used as small first stages

English words in a non-english sci-fi novel

When do you get frequent flier miles - when you buy, or when you fly?

Storing hydrofluoric acid before the invention of plastics

Fundamental Solution of the Pell Equation

What is known about the Ubaid lizard-people figurines?

What would be the ideal power source for a cybernetic eye?

How discoverable are IPv6 addresses and AAAA names by potential attackers?

Why didn't this character "real die" when they blew their stack out in Altered Carbon?

What is the meaning of the new sigil in Game of Thrones Season 8 intro?



Interrogating the results of a Markov-chain simulation - Help and feedback highly appreciated



Announcing the arrival of Valued Associate #679: Cesar Manara
Planned maintenance scheduled April 17/18, 2019 at 00:00UTC (8:00pm US/Eastern)Markov chain stationary probability simulationFinding the steady state Markov chain?Calculating the information per symbol of a markov chain sourceFind the period of a state in a Markov chainFind expected value of time to reach a state in Markov chain, by simulationComputational methods for the limiting distribution of a finite ergodic Markov chainMarkov Chain HelpMarkov Chain and ExpectationSimulation of Markov chainIs the records a Markov chain?










0












$begingroup$


I have built a Markov chain which simulates the daily routine of German residents (activity patterns). Each simulation day is divided into 144-time steps and the person can carry out one of fourteen activities. Those are:
Away - work (1)
Away - leisure (2)
Away - shopping (3)
Sleeping (4)
Cooking (5)
Using dishwasher (6)
Doing laundry (7)
Vacuuming (8)
Watching TV (9)
Using PC (10)
Personal hygiene (11)
Ironing (12)
Listening to music (13)
Other (14)



In my analysis, I want to link some of these activities to appliances in order to generate electricity load profiles for households.



I calculated the transition probabilities for the Markov chain from a data set containing a total of 226 diaries of persons who documented their activities in a 10-minute interval. The data set can be found here: https://www.dropbox.com/s/tyjvh810eg5brkr/data?dl=0



I have now simulated 4000 days with the Markov chain and now I want to validate the results by comparing them with the original data set.



For this purpose, I analyse the results for the activities individually by calculating the expected value as well as the 95% confidence interval for each time step. Then I check if the mean value for the activity from the original dataset is within this interval. Moreover, I calculate the number of all points that do not lie within the upper or lower threshold of the confidence interval. See the picture below for the results for a simulation for 4000 days. Please note that the y-axis has different values for each activity since they do not all equally appear in the original data set (sleeping is more likely to happen over the day than, for instance, ironing).



Results for a 4000 days simulation. Grey area is the 95% confidence interval while the blue line is the mean value for each time step calculated from the original data set.



The deviations, however, vary greatly for each simulation. I, therefore, performed the same simulation 4 times and could observe deviations ranging from 1% to more than 10%. The following table shows the results of my simulations.



Results for the four simulations. each simulation has been done for 4000 days.



Now I wonder, if these results are realistic and if such extreme deviations are possible at all at stochastic simulations or if my Markov chain probably has some mistakes in the code.



I could explain the differences by the fact that the original data set is too small to estimate the probabilities for the Markov chain correctly. However, activities such as "Sleeping" often occur in the original data set, but these values also vary greatly in the results.



I highly appreciate any feedback or help that you can provide.



Thanks,
Felix



Edit: This is the formula that I used to calculate the transition probabilities in Python:



# import necessary libraries

# First, create a 143xarray for transition probability matrix since the first state of the activity pattern will be determined by the starting probability which will be calculated later.

data= data(Dataframe vom dataset see link above)

# in case activity is not observed in time step t, return 0 instead divide by 0

def a1(x, y):
try:
return x / y
except ZeroDivisionError:
return 0

#create matrix for transition probabilities
matrix = np.empty(shape=(143, 14, 14))

#iterate through each time step and calculate probabilities for each activity (from state j to state m)

#save matrix as 3D-array
h=data
for i in range(0, 143):
for j in range(1, 15):
for m in range(1, 15):
a = a1(len(h[(h.iloc[:, i] == j) & (h.iloc[:, i + 1] == m)]), len(h[h.iloc[:, i] == j]))
matrix[i, j - 1, m - 1] = a
np.save(einp_wi_week, matrix)

# now calculate starting probabilities for time step 0 (initial start activity)

matrix_original_probability_starting_state= np.empty(shape=(14, 1))
for i in range(0, 1):
for j in range(1, 15):
a = a1(len(h[(h.iloc[:, i] == j)]), len(h.iloc[:, i]))
matrix_original_probability_starting_state[j - 1, i] = a
np.save(einp_wi_week_start, matrix_original_probability_starting_state)


# import transition probabilities for markov-chain (I have several worksheets to keep track of my approach and to document the results)


# MARKOV-CHAINS TO GENERATE ACTIVITY PATTERNS


# for every time step, the markov-chain can be either in one of these states / activities
states = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14"]

# Possible sequences of state changes (since transferring from state 1 to state 11, for instance has the same code as transferring from 11 to 1, all state transfers for 1 to a number higher than 9 is adjusted in the transfer code)
transitionName = [["11", "12", "13", "14", "15", "16", "17", "18", "19", "0110", "0111", "0112", "0113", "0114"], ["21", "22", "23", "24", "25", "26", "27", "28", "29", "210", "211", "212", "213", "214"], ["31", "32", "33", "34", "35", "36", "37", "38", "39", "310", "311", "312", "313", "314"], ["41", "42", "43", "44", "45", "46", "47", "48", "49", "410", "411", "412", "413", "414"], ["51", "52", "53", "54", "55", "56", "57", "58", "59", "510", "511", "512", "513", "514"], ["61", "62", "63", "64", "65", "66", "67", "68", "69", "610", "611", "612", "613", "614"], ["71", "72", "73", "74", "75", "76", "77", "78", "79", "710", "711", "712", "713", "714"], ["81", "82", "83", "84", "85", "86", "87", "88", "89", "810", "811", "812", "813", "814"], ["91", "92", "93", "94", "95", "96", "97", "98", "99", "910", "911", "912", "913", "914"], ["101", "102", "103", "104", "105", "106", "107", "108", "109", "1010", "1011", "1012", "1013", "1014"], ["111", "112", "113", "114", "115", "116", "117", "118", "119", "1110", "1111", "1112", "1113", "1114"], ["121", "122", "123", "124", "125", "126", "127", "128", "129", "1210", "1211", "1212", "1213", "1214"], ["131", "132", "133", "134", "135", "136", "137", "138", "139", "1310", "1311", "1312", "1313", "1314"], ["141", "142", "143", "144", "145", "146", "147", "148", "149", "1410", "1411", "1412", "1413", "1414"]]


def activity_forecast(simulation_days):
# activitypattern is an empty array that stores the sequence of states in the simulation.
activitypattern = []
for m in range(0,simulation_days):
# for each simulation day, the activitypattern for all 144 time steps (10-min resolution) must be generated

# pro determines the transition probabilities matrix that is used for the simulation day and can be changed with regards to the day simulated (weekday, Friday, Saturday, Sunday).

# Need to be changed for the validation in order to validate activity patterns for weekdays, Fridays, Saturdays, and Sundays (see transition probability matrices above and comments later)

pro = einp_wi_week

#Determine starting activity. Therefore use vector with starting probabilities for all activites, generated in "Clustern_Ausgangsdaten..."
activityToday = np.random.choice(states,replace=True,p=einp_wi_week_start[:,0])



# create array were the sequence of states for simulation day is stored. This array will be appended to the activity pattern array

activityList = [activityToday]

#since the first activity state for each simulation day is determined, we have to generate activity states for the 143 following time steps
for i in range(0,143):
#create sequence of activities
if activityToday == "1":
change = np.random.choice(transitionName[0],replace=True,p=pro[i,0,:])
if change == "11":
activityList.append("1")
pass
elif change == "12":
activityToday = "2"
activityList.append("2")
elif change == "13":
activityToday = "3"
activityList.append("3")
elif change == "14":
activityToday = "4"
activityList.append("4")
elif change == "15":
activityToday = "5"
activityList.append("5")
elif change == "16":
activityToday = "6"
activityList.append("6")
elif change == "17":
activityToday = "7"
activityList.append("7")
elif change == "18":
activityToday = "8"
activityList.append("8")
elif change == "19":
activityToday = "9"
activityList.append("9")
elif change == "0110":
activityToday = "10"
activityList.append("10")
elif change == "0111":
activityToday = "11"
activityList.append("11")
elif change == "0112":
activityToday = "12"
activityList.append("12")
elif change == "0113":
activityToday = "13"
activityList.append("13")
else:
activityToday = "14"
activityList.append("14")
elif activityToday == "2":
change = np.random.choice(transitionName[1],replace=True,p=pro[i,1,:])
if change == "22":
activityList.append("2")
pass
elif change == "21":
activityToday = "1"
activityList.append("1")
elif change == "23":
activityToday = "3"
activityList.append("3")
elif change == "24":
activityToday = "4"
activityList.append("4")
elif change == "25":
activityToday = "5"
activityList.append("5")
elif change == "26":
activityToday = "6"
activityList.append("6")
elif change == "27":
activityToday = "7"
activityList.append("7")
elif change == "28":
activityToday = "8"
activityList.append("8")
elif change == "29":
activityToday = "9"
activityList.append("9")
elif change == "210":
activityToday = "10"
activityList.append("10")
elif change == "211":
activityToday = "11"
activityList.append("11")
elif change == "212":
activityToday = "12"
activityList.append("12")
elif change == "213":
activityToday = "13"
activityList.append("13")
else:
activityToday = "14"
activityList.append("14")
[code is too long - code needs to be written for each activity (1-14)]

activitypattern.append(activityList)
# save the files in order to use them later for comparison(and also for documentation since they can change for each simulation)
df=pd.DataFrame(activitypattern)
df.to_csv("Activitypatterns_synthetic_one_person_ft_aggregated_week_4000_days_new",index=False)

#call function
activity_forecast(4000)










share|cite|improve this question











$endgroup$











  • $begingroup$
    It would be useful to see the transition probability matrix you used.
    $endgroup$
    – Ian
    Apr 1 at 8:47










  • $begingroup$
    Hi @Ian: I copied my code into my question.
    $endgroup$
    – Felix Ziegler
    Apr 1 at 9:45











  • $begingroup$
    Hi Ian, did you have time to look at my code that I posted?
    $endgroup$
    – Felix Ziegler
    Apr 3 at 7:10















0












$begingroup$


I have built a Markov chain which simulates the daily routine of German residents (activity patterns). Each simulation day is divided into 144-time steps and the person can carry out one of fourteen activities. Those are:
Away - work (1)
Away - leisure (2)
Away - shopping (3)
Sleeping (4)
Cooking (5)
Using dishwasher (6)
Doing laundry (7)
Vacuuming (8)
Watching TV (9)
Using PC (10)
Personal hygiene (11)
Ironing (12)
Listening to music (13)
Other (14)



In my analysis, I want to link some of these activities to appliances in order to generate electricity load profiles for households.



I calculated the transition probabilities for the Markov chain from a data set containing a total of 226 diaries of persons who documented their activities in a 10-minute interval. The data set can be found here: https://www.dropbox.com/s/tyjvh810eg5brkr/data?dl=0



I have now simulated 4000 days with the Markov chain and now I want to validate the results by comparing them with the original data set.



For this purpose, I analyse the results for the activities individually by calculating the expected value as well as the 95% confidence interval for each time step. Then I check if the mean value for the activity from the original dataset is within this interval. Moreover, I calculate the number of all points that do not lie within the upper or lower threshold of the confidence interval. See the picture below for the results for a simulation for 4000 days. Please note that the y-axis has different values for each activity since they do not all equally appear in the original data set (sleeping is more likely to happen over the day than, for instance, ironing).



Results for a 4000 days simulation. Grey area is the 95% confidence interval while the blue line is the mean value for each time step calculated from the original data set.



The deviations, however, vary greatly for each simulation. I, therefore, performed the same simulation 4 times and could observe deviations ranging from 1% to more than 10%. The following table shows the results of my simulations.



Results for the four simulations. each simulation has been done for 4000 days.



Now I wonder, if these results are realistic and if such extreme deviations are possible at all at stochastic simulations or if my Markov chain probably has some mistakes in the code.



I could explain the differences by the fact that the original data set is too small to estimate the probabilities for the Markov chain correctly. However, activities such as "Sleeping" often occur in the original data set, but these values also vary greatly in the results.



I highly appreciate any feedback or help that you can provide.



Thanks,
Felix



Edit: This is the formula that I used to calculate the transition probabilities in Python:



# import necessary libraries

# First, create a 143xarray for transition probability matrix since the first state of the activity pattern will be determined by the starting probability which will be calculated later.

data= data(Dataframe vom dataset see link above)

# in case activity is not observed in time step t, return 0 instead divide by 0

def a1(x, y):
try:
return x / y
except ZeroDivisionError:
return 0

#create matrix for transition probabilities
matrix = np.empty(shape=(143, 14, 14))

#iterate through each time step and calculate probabilities for each activity (from state j to state m)

#save matrix as 3D-array
h=data
for i in range(0, 143):
for j in range(1, 15):
for m in range(1, 15):
a = a1(len(h[(h.iloc[:, i] == j) & (h.iloc[:, i + 1] == m)]), len(h[h.iloc[:, i] == j]))
matrix[i, j - 1, m - 1] = a
np.save(einp_wi_week, matrix)

# now calculate starting probabilities for time step 0 (initial start activity)

matrix_original_probability_starting_state= np.empty(shape=(14, 1))
for i in range(0, 1):
for j in range(1, 15):
a = a1(len(h[(h.iloc[:, i] == j)]), len(h.iloc[:, i]))
matrix_original_probability_starting_state[j - 1, i] = a
np.save(einp_wi_week_start, matrix_original_probability_starting_state)


# import transition probabilities for markov-chain (I have several worksheets to keep track of my approach and to document the results)


# MARKOV-CHAINS TO GENERATE ACTIVITY PATTERNS


# for every time step, the markov-chain can be either in one of these states / activities
states = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14"]

# Possible sequences of state changes (since transferring from state 1 to state 11, for instance has the same code as transferring from 11 to 1, all state transfers for 1 to a number higher than 9 is adjusted in the transfer code)
transitionName = [["11", "12", "13", "14", "15", "16", "17", "18", "19", "0110", "0111", "0112", "0113", "0114"], ["21", "22", "23", "24", "25", "26", "27", "28", "29", "210", "211", "212", "213", "214"], ["31", "32", "33", "34", "35", "36", "37", "38", "39", "310", "311", "312", "313", "314"], ["41", "42", "43", "44", "45", "46", "47", "48", "49", "410", "411", "412", "413", "414"], ["51", "52", "53", "54", "55", "56", "57", "58", "59", "510", "511", "512", "513", "514"], ["61", "62", "63", "64", "65", "66", "67", "68", "69", "610", "611", "612", "613", "614"], ["71", "72", "73", "74", "75", "76", "77", "78", "79", "710", "711", "712", "713", "714"], ["81", "82", "83", "84", "85", "86", "87", "88", "89", "810", "811", "812", "813", "814"], ["91", "92", "93", "94", "95", "96", "97", "98", "99", "910", "911", "912", "913", "914"], ["101", "102", "103", "104", "105", "106", "107", "108", "109", "1010", "1011", "1012", "1013", "1014"], ["111", "112", "113", "114", "115", "116", "117", "118", "119", "1110", "1111", "1112", "1113", "1114"], ["121", "122", "123", "124", "125", "126", "127", "128", "129", "1210", "1211", "1212", "1213", "1214"], ["131", "132", "133", "134", "135", "136", "137", "138", "139", "1310", "1311", "1312", "1313", "1314"], ["141", "142", "143", "144", "145", "146", "147", "148", "149", "1410", "1411", "1412", "1413", "1414"]]


def activity_forecast(simulation_days):
# activitypattern is an empty array that stores the sequence of states in the simulation.
activitypattern = []
for m in range(0,simulation_days):
# for each simulation day, the activitypattern for all 144 time steps (10-min resolution) must be generated

# pro determines the transition probabilities matrix that is used for the simulation day and can be changed with regards to the day simulated (weekday, Friday, Saturday, Sunday).

# Need to be changed for the validation in order to validate activity patterns for weekdays, Fridays, Saturdays, and Sundays (see transition probability matrices above and comments later)

pro = einp_wi_week

#Determine starting activity. Therefore use vector with starting probabilities for all activites, generated in "Clustern_Ausgangsdaten..."
activityToday = np.random.choice(states,replace=True,p=einp_wi_week_start[:,0])



# create array were the sequence of states for simulation day is stored. This array will be appended to the activity pattern array

activityList = [activityToday]

#since the first activity state for each simulation day is determined, we have to generate activity states for the 143 following time steps
for i in range(0,143):
#create sequence of activities
if activityToday == "1":
change = np.random.choice(transitionName[0],replace=True,p=pro[i,0,:])
if change == "11":
activityList.append("1")
pass
elif change == "12":
activityToday = "2"
activityList.append("2")
elif change == "13":
activityToday = "3"
activityList.append("3")
elif change == "14":
activityToday = "4"
activityList.append("4")
elif change == "15":
activityToday = "5"
activityList.append("5")
elif change == "16":
activityToday = "6"
activityList.append("6")
elif change == "17":
activityToday = "7"
activityList.append("7")
elif change == "18":
activityToday = "8"
activityList.append("8")
elif change == "19":
activityToday = "9"
activityList.append("9")
elif change == "0110":
activityToday = "10"
activityList.append("10")
elif change == "0111":
activityToday = "11"
activityList.append("11")
elif change == "0112":
activityToday = "12"
activityList.append("12")
elif change == "0113":
activityToday = "13"
activityList.append("13")
else:
activityToday = "14"
activityList.append("14")
elif activityToday == "2":
change = np.random.choice(transitionName[1],replace=True,p=pro[i,1,:])
if change == "22":
activityList.append("2")
pass
elif change == "21":
activityToday = "1"
activityList.append("1")
elif change == "23":
activityToday = "3"
activityList.append("3")
elif change == "24":
activityToday = "4"
activityList.append("4")
elif change == "25":
activityToday = "5"
activityList.append("5")
elif change == "26":
activityToday = "6"
activityList.append("6")
elif change == "27":
activityToday = "7"
activityList.append("7")
elif change == "28":
activityToday = "8"
activityList.append("8")
elif change == "29":
activityToday = "9"
activityList.append("9")
elif change == "210":
activityToday = "10"
activityList.append("10")
elif change == "211":
activityToday = "11"
activityList.append("11")
elif change == "212":
activityToday = "12"
activityList.append("12")
elif change == "213":
activityToday = "13"
activityList.append("13")
else:
activityToday = "14"
activityList.append("14")
[code is too long - code needs to be written for each activity (1-14)]

activitypattern.append(activityList)
# save the files in order to use them later for comparison(and also for documentation since they can change for each simulation)
df=pd.DataFrame(activitypattern)
df.to_csv("Activitypatterns_synthetic_one_person_ft_aggregated_week_4000_days_new",index=False)

#call function
activity_forecast(4000)










share|cite|improve this question











$endgroup$











  • $begingroup$
    It would be useful to see the transition probability matrix you used.
    $endgroup$
    – Ian
    Apr 1 at 8:47










  • $begingroup$
    Hi @Ian: I copied my code into my question.
    $endgroup$
    – Felix Ziegler
    Apr 1 at 9:45











  • $begingroup$
    Hi Ian, did you have time to look at my code that I posted?
    $endgroup$
    – Felix Ziegler
    Apr 3 at 7:10













0












0








0





$begingroup$


I have built a Markov chain which simulates the daily routine of German residents (activity patterns). Each simulation day is divided into 144-time steps and the person can carry out one of fourteen activities. Those are:
Away - work (1)
Away - leisure (2)
Away - shopping (3)
Sleeping (4)
Cooking (5)
Using dishwasher (6)
Doing laundry (7)
Vacuuming (8)
Watching TV (9)
Using PC (10)
Personal hygiene (11)
Ironing (12)
Listening to music (13)
Other (14)



In my analysis, I want to link some of these activities to appliances in order to generate electricity load profiles for households.



I calculated the transition probabilities for the Markov chain from a data set containing a total of 226 diaries of persons who documented their activities in a 10-minute interval. The data set can be found here: https://www.dropbox.com/s/tyjvh810eg5brkr/data?dl=0



I have now simulated 4000 days with the Markov chain and now I want to validate the results by comparing them with the original data set.



For this purpose, I analyse the results for the activities individually by calculating the expected value as well as the 95% confidence interval for each time step. Then I check if the mean value for the activity from the original dataset is within this interval. Moreover, I calculate the number of all points that do not lie within the upper or lower threshold of the confidence interval. See the picture below for the results for a simulation for 4000 days. Please note that the y-axis has different values for each activity since they do not all equally appear in the original data set (sleeping is more likely to happen over the day than, for instance, ironing).



Results for a 4000 days simulation. Grey area is the 95% confidence interval while the blue line is the mean value for each time step calculated from the original data set.



The deviations, however, vary greatly for each simulation. I, therefore, performed the same simulation 4 times and could observe deviations ranging from 1% to more than 10%. The following table shows the results of my simulations.



Results for the four simulations. each simulation has been done for 4000 days.



Now I wonder, if these results are realistic and if such extreme deviations are possible at all at stochastic simulations or if my Markov chain probably has some mistakes in the code.



I could explain the differences by the fact that the original data set is too small to estimate the probabilities for the Markov chain correctly. However, activities such as "Sleeping" often occur in the original data set, but these values also vary greatly in the results.



I highly appreciate any feedback or help that you can provide.



Thanks,
Felix



Edit: This is the formula that I used to calculate the transition probabilities in Python:



# import necessary libraries

# First, create a 143xarray for transition probability matrix since the first state of the activity pattern will be determined by the starting probability which will be calculated later.

data= data(Dataframe vom dataset see link above)

# in case activity is not observed in time step t, return 0 instead divide by 0

def a1(x, y):
try:
return x / y
except ZeroDivisionError:
return 0

#create matrix for transition probabilities
matrix = np.empty(shape=(143, 14, 14))

#iterate through each time step and calculate probabilities for each activity (from state j to state m)

#save matrix as 3D-array
h=data
for i in range(0, 143):
for j in range(1, 15):
for m in range(1, 15):
a = a1(len(h[(h.iloc[:, i] == j) & (h.iloc[:, i + 1] == m)]), len(h[h.iloc[:, i] == j]))
matrix[i, j - 1, m - 1] = a
np.save(einp_wi_week, matrix)

# now calculate starting probabilities for time step 0 (initial start activity)

matrix_original_probability_starting_state= np.empty(shape=(14, 1))
for i in range(0, 1):
for j in range(1, 15):
a = a1(len(h[(h.iloc[:, i] == j)]), len(h.iloc[:, i]))
matrix_original_probability_starting_state[j - 1, i] = a
np.save(einp_wi_week_start, matrix_original_probability_starting_state)


# import transition probabilities for markov-chain (I have several worksheets to keep track of my approach and to document the results)


# MARKOV-CHAINS TO GENERATE ACTIVITY PATTERNS


# for every time step, the markov-chain can be either in one of these states / activities
states = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14"]

# Possible sequences of state changes (since transferring from state 1 to state 11, for instance has the same code as transferring from 11 to 1, all state transfers for 1 to a number higher than 9 is adjusted in the transfer code)
transitionName = [["11", "12", "13", "14", "15", "16", "17", "18", "19", "0110", "0111", "0112", "0113", "0114"], ["21", "22", "23", "24", "25", "26", "27", "28", "29", "210", "211", "212", "213", "214"], ["31", "32", "33", "34", "35", "36", "37", "38", "39", "310", "311", "312", "313", "314"], ["41", "42", "43", "44", "45", "46", "47", "48", "49", "410", "411", "412", "413", "414"], ["51", "52", "53", "54", "55", "56", "57", "58", "59", "510", "511", "512", "513", "514"], ["61", "62", "63", "64", "65", "66", "67", "68", "69", "610", "611", "612", "613", "614"], ["71", "72", "73", "74", "75", "76", "77", "78", "79", "710", "711", "712", "713", "714"], ["81", "82", "83", "84", "85", "86", "87", "88", "89", "810", "811", "812", "813", "814"], ["91", "92", "93", "94", "95", "96", "97", "98", "99", "910", "911", "912", "913", "914"], ["101", "102", "103", "104", "105", "106", "107", "108", "109", "1010", "1011", "1012", "1013", "1014"], ["111", "112", "113", "114", "115", "116", "117", "118", "119", "1110", "1111", "1112", "1113", "1114"], ["121", "122", "123", "124", "125", "126", "127", "128", "129", "1210", "1211", "1212", "1213", "1214"], ["131", "132", "133", "134", "135", "136", "137", "138", "139", "1310", "1311", "1312", "1313", "1314"], ["141", "142", "143", "144", "145", "146", "147", "148", "149", "1410", "1411", "1412", "1413", "1414"]]


def activity_forecast(simulation_days):
# activitypattern is an empty array that stores the sequence of states in the simulation.
activitypattern = []
for m in range(0,simulation_days):
# for each simulation day, the activitypattern for all 144 time steps (10-min resolution) must be generated

# pro determines the transition probabilities matrix that is used for the simulation day and can be changed with regards to the day simulated (weekday, Friday, Saturday, Sunday).

# Need to be changed for the validation in order to validate activity patterns for weekdays, Fridays, Saturdays, and Sundays (see transition probability matrices above and comments later)

pro = einp_wi_week

#Determine starting activity. Therefore use vector with starting probabilities for all activites, generated in "Clustern_Ausgangsdaten..."
activityToday = np.random.choice(states,replace=True,p=einp_wi_week_start[:,0])



# create array were the sequence of states for simulation day is stored. This array will be appended to the activity pattern array

activityList = [activityToday]

#since the first activity state for each simulation day is determined, we have to generate activity states for the 143 following time steps
for i in range(0,143):
#create sequence of activities
if activityToday == "1":
change = np.random.choice(transitionName[0],replace=True,p=pro[i,0,:])
if change == "11":
activityList.append("1")
pass
elif change == "12":
activityToday = "2"
activityList.append("2")
elif change == "13":
activityToday = "3"
activityList.append("3")
elif change == "14":
activityToday = "4"
activityList.append("4")
elif change == "15":
activityToday = "5"
activityList.append("5")
elif change == "16":
activityToday = "6"
activityList.append("6")
elif change == "17":
activityToday = "7"
activityList.append("7")
elif change == "18":
activityToday = "8"
activityList.append("8")
elif change == "19":
activityToday = "9"
activityList.append("9")
elif change == "0110":
activityToday = "10"
activityList.append("10")
elif change == "0111":
activityToday = "11"
activityList.append("11")
elif change == "0112":
activityToday = "12"
activityList.append("12")
elif change == "0113":
activityToday = "13"
activityList.append("13")
else:
activityToday = "14"
activityList.append("14")
elif activityToday == "2":
change = np.random.choice(transitionName[1],replace=True,p=pro[i,1,:])
if change == "22":
activityList.append("2")
pass
elif change == "21":
activityToday = "1"
activityList.append("1")
elif change == "23":
activityToday = "3"
activityList.append("3")
elif change == "24":
activityToday = "4"
activityList.append("4")
elif change == "25":
activityToday = "5"
activityList.append("5")
elif change == "26":
activityToday = "6"
activityList.append("6")
elif change == "27":
activityToday = "7"
activityList.append("7")
elif change == "28":
activityToday = "8"
activityList.append("8")
elif change == "29":
activityToday = "9"
activityList.append("9")
elif change == "210":
activityToday = "10"
activityList.append("10")
elif change == "211":
activityToday = "11"
activityList.append("11")
elif change == "212":
activityToday = "12"
activityList.append("12")
elif change == "213":
activityToday = "13"
activityList.append("13")
else:
activityToday = "14"
activityList.append("14")
[code is too long - code needs to be written for each activity (1-14)]

activitypattern.append(activityList)
# save the files in order to use them later for comparison(and also for documentation since they can change for each simulation)
df=pd.DataFrame(activitypattern)
df.to_csv("Activitypatterns_synthetic_one_person_ft_aggregated_week_4000_days_new",index=False)

#call function
activity_forecast(4000)










share|cite|improve this question











$endgroup$




I have built a Markov chain which simulates the daily routine of German residents (activity patterns). Each simulation day is divided into 144-time steps and the person can carry out one of fourteen activities. Those are:
Away - work (1)
Away - leisure (2)
Away - shopping (3)
Sleeping (4)
Cooking (5)
Using dishwasher (6)
Doing laundry (7)
Vacuuming (8)
Watching TV (9)
Using PC (10)
Personal hygiene (11)
Ironing (12)
Listening to music (13)
Other (14)



In my analysis, I want to link some of these activities to appliances in order to generate electricity load profiles for households.



I calculated the transition probabilities for the Markov chain from a data set containing a total of 226 diaries of persons who documented their activities in a 10-minute interval. The data set can be found here: https://www.dropbox.com/s/tyjvh810eg5brkr/data?dl=0



I have now simulated 4000 days with the Markov chain and now I want to validate the results by comparing them with the original data set.



For this purpose, I analyse the results for the activities individually by calculating the expected value as well as the 95% confidence interval for each time step. Then I check if the mean value for the activity from the original dataset is within this interval. Moreover, I calculate the number of all points that do not lie within the upper or lower threshold of the confidence interval. See the picture below for the results for a simulation for 4000 days. Please note that the y-axis has different values for each activity since they do not all equally appear in the original data set (sleeping is more likely to happen over the day than, for instance, ironing).



Results for a 4000 days simulation. Grey area is the 95% confidence interval while the blue line is the mean value for each time step calculated from the original data set.



The deviations, however, vary greatly for each simulation. I, therefore, performed the same simulation 4 times and could observe deviations ranging from 1% to more than 10%. The following table shows the results of my simulations.



Results for the four simulations. each simulation has been done for 4000 days.



Now I wonder, if these results are realistic and if such extreme deviations are possible at all at stochastic simulations or if my Markov chain probably has some mistakes in the code.



I could explain the differences by the fact that the original data set is too small to estimate the probabilities for the Markov chain correctly. However, activities such as "Sleeping" often occur in the original data set, but these values also vary greatly in the results.



I highly appreciate any feedback or help that you can provide.



Thanks,
Felix



Edit: This is the formula that I used to calculate the transition probabilities in Python:



# import necessary libraries

# First, create a 143xarray for transition probability matrix since the first state of the activity pattern will be determined by the starting probability which will be calculated later.

data= data(Dataframe vom dataset see link above)

# in case activity is not observed in time step t, return 0 instead divide by 0

def a1(x, y):
try:
return x / y
except ZeroDivisionError:
return 0

#create matrix for transition probabilities
matrix = np.empty(shape=(143, 14, 14))

#iterate through each time step and calculate probabilities for each activity (from state j to state m)

#save matrix as 3D-array
h=data
for i in range(0, 143):
for j in range(1, 15):
for m in range(1, 15):
a = a1(len(h[(h.iloc[:, i] == j) & (h.iloc[:, i + 1] == m)]), len(h[h.iloc[:, i] == j]))
matrix[i, j - 1, m - 1] = a
np.save(einp_wi_week, matrix)

# now calculate starting probabilities for time step 0 (initial start activity)

matrix_original_probability_starting_state= np.empty(shape=(14, 1))
for i in range(0, 1):
for j in range(1, 15):
a = a1(len(h[(h.iloc[:, i] == j)]), len(h.iloc[:, i]))
matrix_original_probability_starting_state[j - 1, i] = a
np.save(einp_wi_week_start, matrix_original_probability_starting_state)


# import transition probabilities for markov-chain (I have several worksheets to keep track of my approach and to document the results)


# MARKOV-CHAINS TO GENERATE ACTIVITY PATTERNS


# for every time step, the markov-chain can be either in one of these states / activities
states = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14"]

# Possible sequences of state changes (since transferring from state 1 to state 11, for instance has the same code as transferring from 11 to 1, all state transfers for 1 to a number higher than 9 is adjusted in the transfer code)
transitionName = [["11", "12", "13", "14", "15", "16", "17", "18", "19", "0110", "0111", "0112", "0113", "0114"], ["21", "22", "23", "24", "25", "26", "27", "28", "29", "210", "211", "212", "213", "214"], ["31", "32", "33", "34", "35", "36", "37", "38", "39", "310", "311", "312", "313", "314"], ["41", "42", "43", "44", "45", "46", "47", "48", "49", "410", "411", "412", "413", "414"], ["51", "52", "53", "54", "55", "56", "57", "58", "59", "510", "511", "512", "513", "514"], ["61", "62", "63", "64", "65", "66", "67", "68", "69", "610", "611", "612", "613", "614"], ["71", "72", "73", "74", "75", "76", "77", "78", "79", "710", "711", "712", "713", "714"], ["81", "82", "83", "84", "85", "86", "87", "88", "89", "810", "811", "812", "813", "814"], ["91", "92", "93", "94", "95", "96", "97", "98", "99", "910", "911", "912", "913", "914"], ["101", "102", "103", "104", "105", "106", "107", "108", "109", "1010", "1011", "1012", "1013", "1014"], ["111", "112", "113", "114", "115", "116", "117", "118", "119", "1110", "1111", "1112", "1113", "1114"], ["121", "122", "123", "124", "125", "126", "127", "128", "129", "1210", "1211", "1212", "1213", "1214"], ["131", "132", "133", "134", "135", "136", "137", "138", "139", "1310", "1311", "1312", "1313", "1314"], ["141", "142", "143", "144", "145", "146", "147", "148", "149", "1410", "1411", "1412", "1413", "1414"]]


def activity_forecast(simulation_days):
# activitypattern is an empty array that stores the sequence of states in the simulation.
activitypattern = []
for m in range(0,simulation_days):
# for each simulation day, the activitypattern for all 144 time steps (10-min resolution) must be generated

# pro determines the transition probabilities matrix that is used for the simulation day and can be changed with regards to the day simulated (weekday, Friday, Saturday, Sunday).

# Need to be changed for the validation in order to validate activity patterns for weekdays, Fridays, Saturdays, and Sundays (see transition probability matrices above and comments later)

pro = einp_wi_week

#Determine starting activity. Therefore use vector with starting probabilities for all activites, generated in "Clustern_Ausgangsdaten..."
activityToday = np.random.choice(states,replace=True,p=einp_wi_week_start[:,0])



# create array were the sequence of states for simulation day is stored. This array will be appended to the activity pattern array

activityList = [activityToday]

#since the first activity state for each simulation day is determined, we have to generate activity states for the 143 following time steps
for i in range(0,143):
#create sequence of activities
if activityToday == "1":
change = np.random.choice(transitionName[0],replace=True,p=pro[i,0,:])
if change == "11":
activityList.append("1")
pass
elif change == "12":
activityToday = "2"
activityList.append("2")
elif change == "13":
activityToday = "3"
activityList.append("3")
elif change == "14":
activityToday = "4"
activityList.append("4")
elif change == "15":
activityToday = "5"
activityList.append("5")
elif change == "16":
activityToday = "6"
activityList.append("6")
elif change == "17":
activityToday = "7"
activityList.append("7")
elif change == "18":
activityToday = "8"
activityList.append("8")
elif change == "19":
activityToday = "9"
activityList.append("9")
elif change == "0110":
activityToday = "10"
activityList.append("10")
elif change == "0111":
activityToday = "11"
activityList.append("11")
elif change == "0112":
activityToday = "12"
activityList.append("12")
elif change == "0113":
activityToday = "13"
activityList.append("13")
else:
activityToday = "14"
activityList.append("14")
elif activityToday == "2":
change = np.random.choice(transitionName[1],replace=True,p=pro[i,1,:])
if change == "22":
activityList.append("2")
pass
elif change == "21":
activityToday = "1"
activityList.append("1")
elif change == "23":
activityToday = "3"
activityList.append("3")
elif change == "24":
activityToday = "4"
activityList.append("4")
elif change == "25":
activityToday = "5"
activityList.append("5")
elif change == "26":
activityToday = "6"
activityList.append("6")
elif change == "27":
activityToday = "7"
activityList.append("7")
elif change == "28":
activityToday = "8"
activityList.append("8")
elif change == "29":
activityToday = "9"
activityList.append("9")
elif change == "210":
activityToday = "10"
activityList.append("10")
elif change == "211":
activityToday = "11"
activityList.append("11")
elif change == "212":
activityToday = "12"
activityList.append("12")
elif change == "213":
activityToday = "13"
activityList.append("13")
else:
activityToday = "14"
activityList.append("14")
[code is too long - code needs to be written for each activity (1-14)]

activitypattern.append(activityList)
# save the files in order to use them later for comparison(and also for documentation since they can change for each simulation)
df=pd.DataFrame(activitypattern)
df.to_csv("Activitypatterns_synthetic_one_person_ft_aggregated_week_4000_days_new",index=False)

#call function
activity_forecast(4000)







markov-chains markov-process simulation python






share|cite|improve this question















share|cite|improve this question













share|cite|improve this question




share|cite|improve this question








edited Apr 1 at 9:48







Felix Ziegler

















asked Apr 1 at 8:34









Felix ZieglerFelix Ziegler

11




11











  • $begingroup$
    It would be useful to see the transition probability matrix you used.
    $endgroup$
    – Ian
    Apr 1 at 8:47










  • $begingroup$
    Hi @Ian: I copied my code into my question.
    $endgroup$
    – Felix Ziegler
    Apr 1 at 9:45











  • $begingroup$
    Hi Ian, did you have time to look at my code that I posted?
    $endgroup$
    – Felix Ziegler
    Apr 3 at 7:10
















  • $begingroup$
    It would be useful to see the transition probability matrix you used.
    $endgroup$
    – Ian
    Apr 1 at 8:47










  • $begingroup$
    Hi @Ian: I copied my code into my question.
    $endgroup$
    – Felix Ziegler
    Apr 1 at 9:45











  • $begingroup$
    Hi Ian, did you have time to look at my code that I posted?
    $endgroup$
    – Felix Ziegler
    Apr 3 at 7:10















$begingroup$
It would be useful to see the transition probability matrix you used.
$endgroup$
– Ian
Apr 1 at 8:47




$begingroup$
It would be useful to see the transition probability matrix you used.
$endgroup$
– Ian
Apr 1 at 8:47












$begingroup$
Hi @Ian: I copied my code into my question.
$endgroup$
– Felix Ziegler
Apr 1 at 9:45





$begingroup$
Hi @Ian: I copied my code into my question.
$endgroup$
– Felix Ziegler
Apr 1 at 9:45













$begingroup$
Hi Ian, did you have time to look at my code that I posted?
$endgroup$
– Felix Ziegler
Apr 3 at 7:10




$begingroup$
Hi Ian, did you have time to look at my code that I posted?
$endgroup$
– Felix Ziegler
Apr 3 at 7:10










0






active

oldest

votes












Your Answer








StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "69"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
noCode: true, onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);













draft saved

draft discarded


















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f3170348%2finterrogating-the-results-of-a-markov-chain-simulation-help-and-feedback-highl%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown

























0






active

oldest

votes








0






active

oldest

votes









active

oldest

votes






active

oldest

votes















draft saved

draft discarded
















































Thanks for contributing an answer to Mathematics Stack Exchange!


  • Please be sure to answer the question. Provide details and share your research!

But avoid


  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.

Use MathJax to format equations. MathJax reference.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f3170348%2finterrogating-the-results-of-a-markov-chain-simulation-help-and-feedback-highl%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Triangular numbers and gcdProving sum of a set is $0 pmod n$ if $n$ is odd, or $fracn2 pmod n$ if $n$ is even?Is greatest common divisor of two numbers really their smallest linear combination?GCD, LCM RelationshipProve a set of nonnegative integers with greatest common divisor 1 and closed under addition has all but finite many nonnegative integers.all pairs of a and b in an equation containing gcdTriangular Numbers Modulo $k$ - Hit All Values?Understanding the Existence and Uniqueness of the GCDGCD and LCM with logical symbolsThe greatest common divisor of two positive integers less than 100 is equal to 3. Their least common multiple is twelve times one of the integers.Suppose that for all integers $x$, $x|a$ and $x|b$ if and only if $x|c$. Then $c = gcd(a,b)$Which is the gcd of 2 numbers which are multiplied and the result is 600000?

Barbados Ynhâld Skiednis | Geografy | Demografy | Navigaasjemenu

Σερβία Πίνακας περιεχομένων Γεωγραφία | Ιστορία | Πολιτική | Δημογραφία | Οικονομία | Τουρισμός | Εκπαίδευση και επιστήμη | Πολιτισμός | Δείτε επίσης | Παραπομπές | Εξωτερικοί σύνδεσμοι | Μενού πλοήγησης43°49′00″N 21°08′00″E / 43.8167°N 21.1333°E / 43.8167; 21.133344°49′14″N 20°27′44″E / 44.8206°N 20.4622°E / 44.8206; 20.4622 (Βελιγράδι)Επίσημη εκτίμηση«Σερβία»«Human Development Report 2018»Παγκόσμιος Οργανισμός Υγείας, Προσδόκιμο ζωής και υγιές προσδόκιμο ζωής, Δεδομένα ανά χώρα2003 statistics2004 statistics2005 statistics2006 statistics2007 statistics2008 statistics2009-2013 statistics2014 statisticsStatistical Yearbook of the Republic of Serbia – Tourism, 20152016 statisticsStatistical Yearbook of the Republic of Serbia – Tourism, 2015Πληροφορίες σχετικά με τη Σερβία και τον πολιτισμό τηςΣερβική ΠροεδρίαΕθνικός Οργανισμός Τουρισμού της ΣερβίαςΣερβική ΕθνοσυνέλευσηΣερβίαεε