An examination of using Monte Carlo methods to extract statistically significant results from difficult problems.
Z. W. Miller - 1/22/18
import numpy as np
import random
import matplotlib.pyplot as plt
import pandas as pd
import math
import scipy
%matplotlib inline
plt.style.use('seaborn')
import numpy as np
import sklearn
import matplotlib
import pandas as pd
import sys
libraries = (('Matplotlib', matplotlib), ('Numpy', np), ('Pandas', pd))
print("Python Version:", sys.version, '\n')
for lib in libraries:
print('{0} Version: {1}'.format(lib[0], lib[1].__version__))
If we're going to try to defeat the video poker machine, we need to think about some of the approaches we could take. To do that, let's start by thinking about what video poker looks like. Here's a breakdown of the game (https://en.wikipedia.org/wiki/Video_poker). More importantly, let's think about the structure of the game:
- You bet some money.
- You are dealt 5 cards.
- You can choose none, any, or all of those cards to keep. Any cards you don't choose are replaced by a random deal.
- After the second deal, you check if you have any combinations of cards that result in you winning some money.
- You either lose your bet (if you have no winning hands), or you get the amount associated with the hand you have.
- Repeat until you're out of money or walk-away.
So, if we want to try to beat the machine we realistically need to do one thing - figure out which cards to keep and which ones to throw away every time we're dealt a fresh hand. There are two approaches you can do:
Calculate the exact odds of winning with every combination of cards you possess, and choose the best odds.
Simulate what MIGHT happen to decide if there is a statistically better move.
As you might have guessed, we're going to do the second one. To do that, we're going to need some machinery built up. The first thing we'll need is to teach Python what a deck of cards is. Let's get started on that. In order to make it easy to do, we're going to rely on classes.
The first thing we're going to build a "card" class. What the class will do is store all the information about each card inside a single object. So let's say we build one card that is the "Four of Diamonds" - we'll tell it: hey I want a object of the card class that an id = 4, and a suit = diamonds. Then later if we are scanning through all of our available cards and say, "hey show me what the id is (by doing card.id
) it should tell us, "well this one is a four."
Let's start out by building that:
class card:
def __init__(self, index, suit):
"""
Each card gets an ID and a suit so that we can build a deck of all the possible cards.
We'll also store a name and an id-to-name converter so that we can always get the name
really easily.
"""
self.id = index
self.suit = suit
self.id_to_name = {2: 'two', 3: 'three', 4:'four', 5:'five', 6:'six', 7:'seven',
8: 'eight', 9: 'nine', 10: 'ten', 11: 'jack', 12: 'queen',
13: 'king', 14: 'ace'}
self.suit_to_name = {'H': 'hearts', 'D': 'diamonds', 'C': 'clubs', 'S': 'spades'}
self.name = str(self.id_to_name[index])+' '+str(self.suit_to_name[suit])
Lovely! We've given every card type and id. Notice that we've encoded the jack, queen, king, and ace as the high numbered ids. So we'll have to keep track of that.
Now if we want to build a deck, we just need to loop through all of the suits and create a card for each value. In order to make this simple, let's make a "deck" class that will hold all the cards, and have a few extra things like shuffling and dealing built in. Since we're making this for poker, we'll it automatically deal five cards for now.
import random
import copy
class deck:
def __init__(self):
"""
Creates the deck and stores it in an attribute for later use.
Also makes it so we can store the "current hand" of the player
"""
self.deck = self.build_deck()
self.final_hand = None
def build_deck(self):
"""
Loops through all suits and card values (by id) to create all 52 cards.
"""
deck = []
suits = ['H','D','C','S']
for suit in suits:
for idx in range(2,15):
deck.append(card(idx, suit))
return deck
def shuffle(self):
"""
Shuffles the cards so they are in a random order
"""
random.shuffle(self.deck)
def deal_five(self):
"""
Puts the first five cards into the players hand
and sets the rest of the cards into a new attribute called
"remaining_cards"
"""
self.hand = self.deck[:5]
self.remaining_cards = self.deck[5:]
def draw_cards(self, ids_to_hold=[], shuffle_remaining=False):
"""
This is to be run after we deal 5. This will figure out how many
cards the player wants to hold from their hand (based on the input)
and then replace the rest of the cards from the "remaining_cards."
Since we're going to want to test this over and over, we'll
also add a "shuffle_remaining" option so that we can shuffle
the cards not in the players hand over and over if we want.
The "IDS to hold" tell which card locations we want to hold -
not the card id, but the card location in the hand. So if we want
to hold the first element in the hand list (in the 0th array spot) and
the 3rd card (2nd array spot), ids_to_hold = [0,2].
"""
new_hand = copy.copy(self.hand)
remaining_cards = copy.copy(self.remaining_cards)
if shuffle_remaining:
random.shuffle(remaining_cards)
for i, card in enumerate(new_hand):
if i not in ids_to_hold:
new_hand[i] = remaining_cards.pop(0)
self.final_hand = new_hand
def show_hand(self):
"""
This is just a pretty printing option so we can
see what's in the hand of the player.
"""
for c in self.hand:
print(c.name)
cards = deck()
cards.shuffle()
cards.deal_five()
for c in cards.hand:
print(c.name)
Woo! It worked. We have five cards in our hand. Let's make sure we have the rest of the cards as well. And to be pedantic we'll do a little bit of set math to make sure that no card appears in both the hand and the remaining cards.
for c in cards.remaining_cards:
print(c.name)
print("--- check for intersections ---")
print(set(cards.hand) & set(cards.remaining_cards))
Alright, we've got our deck of cards, the ability to shuffle it and the ability to deal some cards. BAM! On to phase II: We need to be able to know how good our hand is once we have it.
To do that, we're going to create a little scoring class. It will take in the hand that we have, and then use the information from the cards (the class 'card' we made before) to figure out how much the hand is worth. We'll use the following scoring system:
Hand | Payout (for bet of 1) |
---|---|
Royal Flush | 800 |
Straight Flush | 50 |
Four of a kind | 40 |
Full House | 9 |
Flush | 6 |
Straight | 4 |
Three of a Kind | 3 |
Two Pairs | 2 |
Pair Jacks or Higher | 1 |
I won't spend much time here discussing what those are - so if you aren't sure, look them up before proceeding. Throughout the next class, there are quite a few comments to explain each step. It's easier to see it as part of the code than for me to explain it before hand. Onwards!
from collections import Counter
class jacks_or_better_scorer:
def __init__(self, hand):
"""
Take a hand and do some checking on it. Make sure it's 5 cards.
Now get a list of suits and ids. We'll use the ids to check for
straights and pairs, and the suits to check for flushes. Then we'll
check if those exist simultaneously.
Then we'll take the maximum possible payout.
"""
assert len(hand)==5
self.ids = [x.id for x in hand]
self.suits = [x.suit for x in hand]
prs = self.check_for_pairs()
flsh = self.check_for_flush()
strt = self.check_for_straight()
strt_flsh = self.check_straight_flush(strt, flsh)
self.score = max([prs, flsh, strt, strt_flsh])
def check_for_pairs(self):
"""
The counter object returns a list of tuples, where the
tuple is (id, number of appearances). We'll check for
4 of a kind, then full house, then three of a kind, then
two pairs, then finally one pair (but the id has to be bigger
than 10, which means jack or higher). Whatever we find,
we return the correct payout.
"""
c = Counter(self.ids)
m = c.most_common()[:2]
if m[0][1] == 4:
return 25
elif m[0][1] == 3 and m[1][1] == 2:
return 9
elif m[0][1] == 3:
return 3
elif m[0][1] == 2 and m[1][1] == 2:
return 2
elif m[0][1] == 2 and m[0][0] >= 11:
return 1
else:
return 0
def check_for_flush(self):
"""
Using the counter object described in the pairs check, but now
we're just checking if all the suits are the same.
"""
c = Counter(self.suits)
m = c.most_common()[0][1]
if m == 5:
return 6
else:
return 0
def check_for_straight(self):
"""
Checking if the cards are in order using the straight helper
function. The confusing part here is to check it both if the
aces are counted as high and if they are counted as low.
"""
is_straight = 0
# section to handle if the ace is 1 instead of 14
if 14 in self.ids:
new_ids = [i if i != 14 else 1 for i in self.ids]
is_straight += self.straight_helper(new_ids)
# Check if straight with aces as 14
is_straight += self.straight_helper(self.ids)
if is_straight:
return 4
else:
return 0
def straight_helper(self, hand_ids):
"""
A helper function that sorts the card ids,
then goes through and makes sure that each card is
one higher than the previous. If it's not,
we mark it as a 0, not a straight.
"""
li2 = sorted(hand_ids)
it=iter(li2[1:])
if all(int(next(it))-int(i)==1 for i in li2[:-1]):
return 1
else:
return 0
def check_straight_flush(self, strt, flsh):
"""
Check if this is a straight flush. If it is
and both the king and ace are in there,
mark that it's a Royal flush and return the
biggest payout.
"""
if flsh and strt:
if 13 in self.ids and 14 in self.ids:
return 800
else:
return 50
else:
return 0
That looks like it should work, let's do some testing.
cards = deck()
cards.deal_five()
cards.show_hand()
jb = jacks_or_better_scorer(cards.hand)
print(jb.score)
That's right. We didn't do any shuffling, so our hand was just the first five cards we made. Those also happen to be a straight flush - worth 50 credits. I did a bunch more testing to make sure it works - and you should too, but for now I'm satisfied that it's all working. On to the actual Monte Carlo part!
So let's use a small example to explain our methodology here. The question that spawned this whole project is: if I have an initial deal that has both a single Queen (jack or higher) and a small pair (two fours), should I hold the queen or the fours? So let's try that. Note that I just shuffled the cards over and over until I got a hand that met this criteria!
cards = deck()
cards.shuffle()
cards.deal_five()
for c in cards.hand:
print(c.name)
For a single time, I could just do the draw cards and tell it to hold the queen in position 0. Then I look at the hand output. In this case, I'd have lost. But then we can run it through millions of times with different shuffles each time. By doing it lots and lots of times, I start to build up an understanding of what my expected win amount will be if I hold just the queen. Let's see that in action.
cards.draw_cards(ids_to_hold=[0], shuffle_remaining=True)
for c in cards.final_hand:
print(c.name)
jack_or_higher = []
for _ in range(1000000):
cards.draw_cards(ids_to_hold=[0], shuffle_remaining=True)
jb = jacks_or_better_scorer(cards.final_hand)
jack_or_higher.append(jb.score)
print("Expected Return: ", np.mean(jack_or_higher))
print("Max Win: ", np.max(jack_or_higher))
So I told it to run 1M times, shuffling the cards every time, so that I can see as many different combinations of the 4 replacement cards as possible. Each time I keep track of how much I won (with a 0 being added if I lost). Then I can see what my final expected win amount is. You can see that by holding the queen, I can expect to only win 0.467 credits (on average) when I do that. We can also see that it was possible for me to get a royal flush by holding the queen (since I did have a maximum winning of 800). So now, let's do the same thing, but instead of holding the queen, I'll hold the pair of 4s.
cards.draw_cards(ids_to_hold=[2,4], shuffle_remaining=True)
for c in cards.final_hand:
print(c.name)
jack_or_higher = []
for _ in range(1000000):
cards.draw_cards(ids_to_hold=[2,4], shuffle_remaining=True)
jb = jacks_or_better_scorer(cards.final_hand)
jack_or_higher.append(jb.score)
print("Expected Return: ", np.mean(jack_or_higher))
print("Max Win: ", np.max(jack_or_higher))
Nice! I have an almost double expected win rate! So it's way better to hold the two 4s. That just gives me way more chance of winning, since I can win with another 4, any other pair (doesn't have to be jacks or better), or any three of a kind. So on average, it's almost twice as smart to hold the pair instead of the lone face card.
One thing to note, in neither case do we break even on average. Yikes.
So if we want to try to beat video poker, what we can do is find every possible combination of card holding, then simulate the expected winnings by doing that hold. Then we just choose the best possible play every time.
This function will find every possible hold combination. We can see them printed out for an unshuffled deck below.
from itertools import combinations
def combinations(lst, depth, start=0, prepend=[]):
if depth <= 0:
yield prepend
else:
for i in range(start, len(lst)):
for c in combinations(lst, depth - 1, i + 1, prepend + [lst[i]]):
yield c
cards = deck()
cards.deal_five()
possible_hold_combos = []
for i in range(1,6):
for c in combinations(cards.hand, i):
possible_hold_combos.append(c)
len(possible_hold_combos)
for x in possible_hold_combos:
for c in x:
print(c.name)
print()
It will be easier to just get a bunch of lists of locations to hold than keep the names though. So let's redo it with just the locations.
cards = deck()
cards.shuffle()
cards.deal_five()
possible_hold_combos = [[]]
for i in range(1,6):
for c in combinations([0,1,2,3,4], i):
possible_hold_combos.append(c)
Now let's create a dictionary to keep track of our scoring.
d = {}
for c in possible_hold_combos:
d[str(c)] = []
And now this is where the Monte Carlo comes in. We'll go through each combination and try holding those cards. Then simulate 5000 hands and see what our average score is going to be. This takes a while to run, but that's okay because we know we're doing a lot of work. We are doing 31 combinations * 5000 simulations/combination = 155,000 simulations!
for combo in possible_hold_combos:
for _ in range(5000):
cards.draw_cards(ids_to_hold=combo, shuffle_remaining=True)
jb = jacks_or_better_scorer(cards.final_hand)
d[str(combo)].append(jb.score)
cards.show_hand()
And now that we have the results, let's see what we've got. So for this hand, it looks like the best move is to hold the pair of sixes... That makes sense!
results = []
for c, v in d.items():
results.append((c,np.mean(v)))
sorted(results, key=lambda x: x[1], reverse=True)
Now we need one more piece of machinery, the ability to grab the best possible play from our simulations.
eval(sorted(results, key=lambda x: x[1], reverse=True)[0][0])
Boom. Now we're ready to really play some poker.
So let's create a function where we tell it how much money we want to use, how many simulations we want to run per combination, and how many games it can play before deciding, "okay, you've played long enough" and stopping to see how much money you have left.
Note, there are a few other options here that we're going to use to determine what information comes back out of this function (basically so we can make some plots and stuff).
def play_poker(money, sim_strength=1000, max_count=10000, return_count=False, return_both=False, verbose=False):
money = 20
money_tally = [money]
count = 0
# Checks to see if we have enough money to play and that we haven't been playing too long
while money > 0 and count < max_count:
# bets 1 dollar and counts as playing
count += 1
money -= 1
# Get the cards setup
cards = deck()
cards.shuffle()
cards.deal_five()
# Set up our result checker
d = {}
for c in possible_hold_combos:
d[str(c)] = []
# Now loop through all the hands and check what the expected score is. This is the Monte
# Carlo part. We're using a bunch of random draws to see what the best move is statistically.
for combo in possible_hold_combos:
for i in range(sim_strength):
cards.draw_cards(ids_to_hold=combo, shuffle_remaining=True)
jb = jacks_or_better_scorer(cards.final_hand)
d[str(combo)].append(jb.score)
results = []
for c, v in d.items():
results.append((c,np.mean(v)))
# Get the best possible move
best_move = eval(sorted(results, key=lambda x: x[1], reverse=True)[0][0])
# Now actually draw the cards based on that move (note we shuffle here so we aren't just using
# whatever the last set of cards were. It would be cheating if we didn't shuffle.)
cards.draw_cards(ids_to_hold=best_move, shuffle_remaining=True)
winnings = jacks_or_better_scorer(cards.final_hand).score
# Now keep track of our winnings and print if we ask it to. Then play again.
money += winnings
money_tally.append(money)
if count%10 == 0 and verbose:
print("Hand %i, Money: %i"%(count,money))
# Return things, whether it be money or lists of moneys, or how long we've been playing.
if return_both:
return count, money_tally
elif return_count:
return count
else:
return money_tally
And now let's test it by watching our money from a game with verbose turned on.
money_tally = play_poker(20, sim_strength=100, verbose=True)
And now let's plot our money over time!
plt.figure(dpi=200)
max_len = len(money_tally)
plt.plot(range(max_len), money_tally);
plt.plot([0,max_len],[20,20],'r--', lw=2, alpha=0.5)
plt.xlabel("Number of Hands")
plt.ylabel("Amount of Money");
Nice! It all works, we've gained a bunch of money, then promptly lost it. Just like real gambling! In order to make sure our system is working, let's compare it to not going monte carlo and instead just playing totally at random. We better do a lot better than the random system.
def play_poker_randomly(money, max_count=10000, return_count=False, verbose=False):
money = 20
money_tally = [money]
count = 0
while money > 0 and count < max_count:
count += 1
money -= 1
cards = deck()
cards.shuffle()
cards.deal_five()
cards.draw_cards(ids_to_hold=np.random.choice(possible_hold_combos), shuffle_remaining=True)
winnings = jacks_or_better_scorer(cards.final_hand).score
money += winnings
money_tally.append(money)
if count%10 == 0 and verbose:
print("Hand %i, Money: %i"%(count,money))
if return_count:
return count
else:
return money_tally
Let's play 100 games randomly, and 100 games with our smarter system, then compare the results.
random_survive = []
for _ in range(100):
random_survive.append(play_poker_randomly(20, max_count=100, return_count=True))
smart_survive = []
for _ in range(100):
smart_survive.append(play_poker(20, max_count=100, sim_strength=100, return_count=True))
import seaborn as sns
plt.figure(dpi=200)
sns.distplot(random_survive, label="Random Play");
sns.distplot(smart_survive, color='r', label="Smartest Play");
plt.legend(loc='upper right');
plt.title("Number of Hands Survived with 20 Credits (Max 100)");
Awesome! That's very bi-modal. That means we have two totally different behaviors overall. Our smart playing is clrearly doing much, much better than randomly playing. What happens if we turn the number of simulations per combination down? We'd imagine that the Monte Carlo brain wouldn't gather as much information and therefore would be a bit dumber. Let's check.
smart_survive_2 = []
for _ in range(100):
smart_survive_2.append(play_poker(20, max_count=500, sim_strength=50, return_count=True))
plt.figure(dpi=200)
sns.distplot(smart_survive_2, color='r', label="Smartest Play", bins=20);
plt.legend(loc='upper right');
plt.title("Number of Hands Survived with 20 Credits (50 Sims per hold combo)");
smart_survive_3 = []
smart_tally_3 = []
for player_num in range(53):
print("Player ", player_num)
i,j = play_poker(20, max_count=1000, sim_strength=200, return_both=True)
smart_survive_3.append(i)
smart_tally_3.append(j)
plt.figure(dpi=200)
sns.distplot(smart_survive_3, color='r', label="Smartest Play", bins=20);
plt.legend(loc='upper right');
plt.title("Number of Hands Survived with 20 Credits (200 Sims per hold combo)");
Nice! Adding more simulations makes it smarter. It also adds a lot of computation time. I played with it a bit, and while going to a few thousand simulations per combo does make it smarter than only a few hundred... it's not by too much. So now we can actually answer the question - can you beat video poker - by letting a bunch of players play, then simulating a few hundred times per hold combination, and seeing how long they can last.
max_max_len = 0
plt.figure(dpi=200)
plt.xlabel("Number of Hands")
plt.ylabel("Amount of Money");
plt.title("Money vs Number of Hands (53 players - 200 sims per combo)")
for money_tally in smart_tally_3:
max_len = len(money_tally)
if max_len > max_max_len:
max_max_len = max_len
plt.plot(range(max_len), money_tally, lw=1);
plt.plot([0,max_max_len],[20,20],'k--', lw=2, alpha=0.5);
total_survivors = 0
total_profitors = 0
for x in smart_tally_3:
if len(x) == 1001:
total_survivors += 1
if x[-1] > 20:
total_profitors += 1
print("Percent lasting 1000 hands: ", str(round(total_survivors/len(smart_tally_3),3)))
print("Percent with profit after 1000 hands: ", str(round(total_profitors/len(smart_tally_3),3)))
As you can see, the vast, vast majority of players lose, even if playing as smart as possible. Out of 53 players, only 7% had ANY money left after 1000 hands. In fact, almost 80% lost everything within the first 100 hands! Only 5% had any profit after 1000 hands!
So, the system is weighted so heavily against the player that even if you play perfectly, the casino will come out on top 95% of the time. Given an infinite number of hands, the casino ALWAYS wins.