Can you beat Video Poker?

An examination of using Monte Carlo methods to extract statistically significant results from difficult problems.

Z. W. Miller - 1/22/18

In [1]:
import numpy as np
import random
import matplotlib.pyplot as plt
import pandas as pd
import math
import scipy
%matplotlib inline
plt.style.use('seaborn')
In [2]:
import numpy as np
import sklearn
import matplotlib
import pandas as pd
import sys
libraries = (('Matplotlib', matplotlib), ('Numpy', np), ('Pandas', pd))

print("Python Version:", sys.version, '\n')
for lib in libraries:
    print('{0} Version: {1}'.format(lib[0], lib[1].__version__))
Python Version: 3.6.2 |Anaconda custom (64-bit)| (default, Sep 21 2017, 18:29:43) 
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] 

Matplotlib Version: 2.0.2
Numpy Version: 1.12.1
Pandas Version: 0.20.3

Setting up the problem and the cards

If we're going to try to defeat the video poker machine, we need to think about some of the approaches we could take. To do that, let's start by thinking about what video poker looks like. Here's a breakdown of the game (https://en.wikipedia.org/wiki/Video_poker). More importantly, let's think about the structure of the game:

 - You bet some money.
 - You are dealt 5 cards. 
 - You can choose none, any, or all of those cards to keep. Any cards you don't choose are replaced by a random deal.
 - After the second deal, you check if you have any combinations of cards that result in you winning some money.
 - You either lose your bet (if you have no winning hands), or you get the amount associated with the hand you have.
 - Repeat until you're out of money or walk-away.

So, if we want to try to beat the machine we realistically need to do one thing - figure out which cards to keep and which ones to throw away every time we're dealt a fresh hand. There are two approaches you can do:

  1. Calculate the exact odds of winning with every combination of cards you possess, and choose the best odds.

  2. Simulate what MIGHT happen to decide if there is a statistically better move.

As you might have guessed, we're going to do the second one. To do that, we're going to need some machinery built up. The first thing we'll need is to teach Python what a deck of cards is. Let's get started on that. In order to make it easy to do, we're going to rely on classes.

The first thing we're going to build a "card" class. What the class will do is store all the information about each card inside a single object. So let's say we build one card that is the "Four of Diamonds" - we'll tell it: hey I want a object of the card class that an id = 4, and a suit = diamonds. Then later if we are scanning through all of our available cards and say, "hey show me what the id is (by doing card.id) it should tell us, "well this one is a four."

Let's start out by building that:

In [3]:
class card:
    
    def __init__(self, index, suit):
        """
        Each card gets an ID and a suit so that we can build a deck of all the possible cards.
        We'll also store a name and an id-to-name converter so that we can always get the name
        really easily. 
        """
        self.id = index
        self.suit = suit
        self.id_to_name = {2: 'two', 3: 'three', 4:'four', 5:'five', 6:'six', 7:'seven',
                      8: 'eight', 9: 'nine', 10: 'ten', 11: 'jack', 12: 'queen',
                      13: 'king', 14: 'ace'}
        self.suit_to_name = {'H': 'hearts', 'D': 'diamonds', 'C': 'clubs', 'S': 'spades'}
        self.name = str(self.id_to_name[index])+' '+str(self.suit_to_name[suit])       

Lovely! We've given every card type and id. Notice that we've encoded the jack, queen, king, and ace as the high numbered ids. So we'll have to keep track of that.

Now if we want to build a deck, we just need to loop through all of the suits and create a card for each value. In order to make this simple, let's make a "deck" class that will hold all the cards, and have a few extra things like shuffling and dealing built in. Since we're making this for poker, we'll it automatically deal five cards for now.

In [4]:
import random
import copy

class deck:
    
    def __init__(self):
        """
        Creates the deck and stores it in an attribute for later use.
        Also makes it so we can store the "current hand" of the player
        """
        self.deck = self.build_deck()
        self.final_hand = None
        
    def build_deck(self):
        """
        Loops through all suits and card values (by id) to create all 52 cards.
        """
        deck = []
        suits = ['H','D','C','S']
        for suit in suits:
            for idx in range(2,15):
                deck.append(card(idx, suit))
        return deck
    
    def shuffle(self):
        """
        Shuffles the cards so they are in a random order
        """
        random.shuffle(self.deck)
        
    def deal_five(self):
        """
        Puts the first five cards into the players hand
        and sets the rest of the cards into a new attribute called
        "remaining_cards"
        """
        self.hand = self.deck[:5]
        self.remaining_cards = self.deck[5:]
    
    def draw_cards(self, ids_to_hold=[], shuffle_remaining=False):
        """
        This is to be run after we deal 5. This will figure out how many
        cards the player wants to hold from their hand (based on the input)
        and then replace the rest of the cards from the "remaining_cards."
        Since we're going to want to test this over and over, we'll
        also add a "shuffle_remaining" option so that we can shuffle
        the cards not in the players hand over and over if we want.
        The "IDS to hold" tell which card locations we want to hold - 
        not the card id, but the card location in the hand. So if we want
        to hold the first element in the hand list (in the 0th array spot) and 
        the 3rd card (2nd array spot), ids_to_hold = [0,2].
        """
        new_hand = copy.copy(self.hand)            
        remaining_cards = copy.copy(self.remaining_cards)
        
        if shuffle_remaining:
            random.shuffle(remaining_cards)
            
        for i, card in enumerate(new_hand):
            if i not in ids_to_hold:
                new_hand[i] = remaining_cards.pop(0)
        
        self.final_hand = new_hand
    
    def show_hand(self):
        """
        This is just a pretty printing option so we can
        see what's in the hand of the player.
        """
        for c in self.hand:
            print(c.name)
In [5]:
cards = deck()
cards.shuffle()
cards.deal_five()

for c in cards.hand:
    print(c.name)
six clubs
nine clubs
seven spades
king diamonds
jack diamonds
In [10]:
 
Out[10]:
set()

Woo! It worked. We have five cards in our hand. Let's make sure we have the rest of the cards as well. And to be pedantic we'll do a little bit of set math to make sure that no card appears in both the hand and the remaining cards.

In [12]:
for c in cards.remaining_cards:
    print(c.name)
    
print("--- check for intersections ---")
print(set(cards.hand) & set(cards.remaining_cards))
six diamonds
two hearts
ten diamonds
six hearts
king hearts
jack spades
jack hearts
ace diamonds
eight clubs
ten clubs
four hearts
six spades
four clubs
two spades
king spades
nine diamonds
nine hearts
eight spades
queen clubs
three clubs
seven hearts
eight hearts
three spades
five clubs
ten spades
ace clubs
queen spades
three diamonds
ten hearts
five hearts
jack clubs
queen diamonds
ace spades
queen hearts
four spades
seven clubs
two clubs
king clubs
seven diamonds
five spades
five diamonds
ace hearts
nine spades
eight diamonds
two diamonds
four diamonds
three hearts
--- check for intersections ---
set()

Alright, we've got our deck of cards, the ability to shuffle it and the ability to deal some cards. BAM! On to phase II: We need to be able to know how good our hand is once we have it.

To do that, we're going to create a little scoring class. It will take in the hand that we have, and then use the information from the cards (the class 'card' we made before) to figure out how much the hand is worth. We'll use the following scoring system:

Hand Payout (for bet of 1)
Royal Flush 800
Straight Flush 50
Four of a kind 40
Full House 9
Flush 6
Straight 4
Three of a Kind 3
Two Pairs 2
Pair Jacks or Higher 1

I won't spend much time here discussing what those are - so if you aren't sure, look them up before proceeding. Throughout the next class, there are quite a few comments to explain each step. It's easier to see it as part of the code than for me to explain it before hand. Onwards!

In [13]:
from collections import Counter

class jacks_or_better_scorer:
    
    def __init__(self, hand):
        """
        Take a hand and do some checking on it. Make sure it's 5 cards.
        Now get a list of suits and ids. We'll use the ids to check for
        straights and pairs, and the suits to check for flushes. Then we'll
        check if those exist simultaneously.
        
        Then we'll take the maximum possible payout.
        """
        assert len(hand)==5
        self.ids = [x.id for x in hand]
        self.suits = [x.suit for x in hand]
        prs = self.check_for_pairs()
        flsh = self.check_for_flush()
        strt = self.check_for_straight()
        strt_flsh = self.check_straight_flush(strt, flsh)
        self.score = max([prs, flsh, strt, strt_flsh])
        
    def check_for_pairs(self):
        """
        The counter object returns a list of tuples, where the 
        tuple is (id, number of appearances). We'll check for
        4 of a kind, then full house, then three of a kind, then
        two pairs, then finally one pair (but the id has to be bigger
        than 10, which means jack or higher). Whatever we find,
        we return the correct payout.
        """
        c = Counter(self.ids)
        m = c.most_common()[:2]
        if m[0][1] == 4:
            return 25
        elif m[0][1] == 3 and m[1][1] == 2:
            return 9  
        elif m[0][1] == 3:
            return 3
        elif m[0][1] == 2 and m[1][1] == 2:
            return 2
        elif m[0][1] == 2 and m[0][0] >= 11:
            return 1
        else:
            return 0
        
    def check_for_flush(self): 
        """
        Using the counter object described in the pairs check, but now
        we're just checking if all the suits are the same.
        """
        c = Counter(self.suits)
        m = c.most_common()[0][1]
        if m == 5:
            return 6
        else:
            return 0
        
    def check_for_straight(self):
        """
        Checking if the cards are in order using the straight helper
        function. The confusing part here is to check it both if the
        aces are counted as high and if they are counted as low.
        """
        is_straight = 0
        
        # section to handle if the ace is 1 instead of 14
        if 14 in self.ids:
            new_ids = [i if i != 14 else 1 for i in self.ids]
            is_straight += self.straight_helper(new_ids)
        
        # Check if straight with aces as 14
        is_straight += self.straight_helper(self.ids)
        
        if is_straight:
            return 4
        else:
            return 0
        
    def straight_helper(self, hand_ids):
        """
        A helper function that sorts the card ids,
        then goes through and makes sure that each card is
        one higher than the previous. If it's not, 
        we mark it as a 0, not a straight.
        """
        li2 = sorted(hand_ids)
        it=iter(li2[1:])
        if all(int(next(it))-int(i)==1 for i in li2[:-1]):
            return 1
        else:
            return 0
        
    def check_straight_flush(self, strt, flsh):
        """
        Check if this is a straight flush. If it is
        and both the king and ace are in there, 
        mark that it's a Royal flush and return the
        biggest payout. 
        """
        if flsh and strt:
            if 13 in self.ids and 14 in self.ids:
                return 800
            else:
                return 50
        else:
            return 0
        

That looks like it should work, let's do some testing.

In [16]:
cards = deck()
cards.deal_five()
cards.show_hand()
jb = jacks_or_better_scorer(cards.hand)
print(jb.score)
two hearts
three hearts
four hearts
five hearts
six hearts
50

That's right. We didn't do any shuffling, so our hand was just the first five cards we made. Those also happen to be a straight flush - worth 50 credits. I did a bunch more testing to make sure it works - and you should too, but for now I'm satisfied that it's all working. On to the actual Monte Carlo part!

Can you actually beat Video Poker?

So let's use a small example to explain our methodology here. The question that spawned this whole project is: if I have an initial deal that has both a single Queen (jack or higher) and a small pair (two fours), should I hold the queen or the fours? So let's try that. Note that I just shuffled the cards over and over until I got a hand that met this criteria!

In [159]:
cards = deck()
cards.shuffle()
cards.deal_five()

for c in cards.hand:
    print(c.name)
queen diamonds
five spades
four clubs
jack hearts
four diamonds

Hold single Jack or higher card

For a single time, I could just do the draw cards and tell it to hold the queen in position 0. Then I look at the hand output. In this case, I'd have lost. But then we can run it through millions of times with different shuffles each time. By doing it lots and lots of times, I start to build up an understanding of what my expected win amount will be if I hold just the queen. Let's see that in action.

In [160]:
cards.draw_cards(ids_to_hold=[0], shuffle_remaining=True)
for c in cards.final_hand:
    print(c.name)
queen diamonds
ace spades
nine clubs
king diamonds
jack spades
In [165]:
jack_or_higher = []
for _ in range(1000000):
    cards.draw_cards(ids_to_hold=[0], shuffle_remaining=True)
    jb = jacks_or_better_scorer(cards.final_hand)
    jack_or_higher.append(jb.score)
print("Expected Return: ", np.mean(jack_or_higher))
print("Max Win: ", np.max(jack_or_higher))
Expected Return:  0.466921
Max Win:  800

So I told it to run 1M times, shuffling the cards every time, so that I can see as many different combinations of the 4 replacement cards as possible. Each time I keep track of how much I won (with a 0 being added if I lost). Then I can see what my final expected win amount is. You can see that by holding the queen, I can expect to only win 0.467 credits (on average) when I do that. We can also see that it was possible for me to get a royal flush by holding the queen (since I did have a maximum winning of 800). So now, let's do the same thing, but instead of holding the queen, I'll hold the pair of 4s.

Hold small pair (4s)

In [162]:
cards.draw_cards(ids_to_hold=[2,4], shuffle_remaining=True)
for c in cards.final_hand:
    print(c.name)
queen spades
king diamonds
four clubs
nine clubs
four diamonds
In [166]:
jack_or_higher = []
for _ in range(1000000):
    cards.draw_cards(ids_to_hold=[2,4], shuffle_remaining=True)
    jb = jacks_or_better_scorer(cards.final_hand)
    jack_or_higher.append(jb.score)
print("Expected Return: ", np.mean(jack_or_higher))
print("Max Win: ", np.max(jack_or_higher))
Expected Return:  0.825666
Max Win:  25

Nice! I have an almost double expected win rate! So it's way better to hold the two 4s. That just gives me way more chance of winning, since I can win with another 4, any other pair (doesn't have to be jacks or better), or any three of a kind. So on average, it's almost twice as smart to hold the pair instead of the lone face card.

One thing to note, in neither case do we break even on average. Yikes.

So if we want to try to beat video poker, what we can do is find every possible combination of card holding, then simulate the expected winnings by doing that hold. Then we just choose the best possible play every time.

Now let's simulate each possible "hold" possibility from our initial deal

This function will find every possible hold combination. We can see them printed out for an unshuffled deck below.

In [17]:
from itertools import combinations

def combinations(lst, depth, start=0, prepend=[]):
    if depth <= 0:
        yield prepend
    else:
        for i in range(start, len(lst)):
            for c in combinations(lst, depth - 1, i + 1, prepend + [lst[i]]):
                yield c
cards = deck()
cards.deal_five()
possible_hold_combos = []
for i in range(1,6):
    for c in combinations(cards.hand, i):
        possible_hold_combos.append(c)
In [18]:
len(possible_hold_combos)
Out[18]:
31
In [19]:
for x in possible_hold_combos:
    for c in x:
        print(c.name)
    print()
two hearts

three hearts

four hearts

five hearts

six hearts

two hearts
three hearts

two hearts
four hearts

two hearts
five hearts

two hearts
six hearts

three hearts
four hearts

three hearts
five hearts

three hearts
six hearts

four hearts
five hearts

four hearts
six hearts

five hearts
six hearts

two hearts
three hearts
four hearts

two hearts
three hearts
five hearts

two hearts
three hearts
six hearts

two hearts
four hearts
five hearts

two hearts
four hearts
six hearts

two hearts
five hearts
six hearts

three hearts
four hearts
five hearts

three hearts
four hearts
six hearts

three hearts
five hearts
six hearts

four hearts
five hearts
six hearts

two hearts
three hearts
four hearts
five hearts

two hearts
three hearts
four hearts
six hearts

two hearts
three hearts
five hearts
six hearts

two hearts
four hearts
five hearts
six hearts

three hearts
four hearts
five hearts
six hearts

two hearts
three hearts
four hearts
five hearts
six hearts

It will be easier to just get a bunch of lists of locations to hold than keep the names though. So let's redo it with just the locations.

In [20]:
cards = deck()
cards.shuffle()
cards.deal_five()
In [21]:
possible_hold_combos = [[]]
for i in range(1,6):
    for c in combinations([0,1,2,3,4], i):
        possible_hold_combos.append(c)

Now let's create a dictionary to keep track of our scoring.

In [22]:
d = {}
for c in possible_hold_combos:
    d[str(c)] = []

And now this is where the Monte Carlo comes in. We'll go through each combination and try holding those cards. Then simulate 5000 hands and see what our average score is going to be. This takes a while to run, but that's okay because we know we're doing a lot of work. We are doing 31 combinations * 5000 simulations/combination = 155,000 simulations!

In [23]:
for combo in possible_hold_combos:
    for _ in range(5000):
        cards.draw_cards(ids_to_hold=combo, shuffle_remaining=True)
        jb = jacks_or_better_scorer(cards.final_hand)
        d[str(combo)].append(jb.score)
In [24]:
cards.show_hand()
seven spades
nine spades
six hearts
three diamonds
six diamonds

And now that we have the results, let's see what we've got. So for this hand, it looks like the best move is to hold the pair of sixes... That makes sense!

In [25]:
results = []
for c, v in d.items():
    results.append((c,np.mean(v)))

sorted(results, key=lambda x: x[1], reverse=True)
Out[25]:
[('[2, 4]', 0.85980000000000001),
 ('[1, 2, 4]', 0.70020000000000004),
 ('[0, 2, 4]', 0.69779999999999998),
 ('[2, 3, 4]', 0.63560000000000005),
 ('[1, 2, 3, 4]', 0.38519999999999999),
 ('[0, 2, 3, 4]', 0.37019999999999997),
 ('[0, 1, 2, 4]', 0.36859999999999998),
 ('[]', 0.36699999999999999),
 ('[3]', 0.31380000000000002),
 ('[1]', 0.31340000000000001),
 ('[0]', 0.30719999999999997),
 ('[0, 1]', 0.29859999999999998),
 ('[2]', 0.25800000000000001),
 ('[4]', 0.24879999999999999),
 ('[3, 4]', 0.2432),
 ('[0, 2]', 0.2306),
 ('[0, 3]', 0.22),
 ('[1, 3]', 0.216),
 ('[0, 1, 2]', 0.2112),
 ('[0, 1, 4]', 0.2034),
 ('[2, 3]', 0.1978),
 ('[0, 4]', 0.1946),
 ('[1, 4]', 0.1832),
 ('[1, 2]', 0.18079999999999999),
 ('[0, 3, 4]', 0.1406),
 ('[0, 2, 3]', 0.1376),
 ('[0, 1, 3]', 0.10100000000000001),
 ('[1, 3, 4]', 0.082000000000000003),
 ('[1, 2, 3]', 0.080799999999999997),
 ('[0, 1, 2, 3]', 0.0),
 ('[0, 1, 3, 4]', 0.0),
 ('[0, 1, 2, 3, 4]', 0.0)]

Now we need one more piece of machinery, the ability to grab the best possible play from our simulations.

In [26]:
eval(sorted(results, key=lambda x: x[1], reverse=True)[0][0])
Out[26]:
[2, 4]

Boom. Now we're ready to really play some poker.

Now let's see how it does actually playing and see if it can actually come out ahead.

So let's create a function where we tell it how much money we want to use, how many simulations we want to run per combination, and how many games it can play before deciding, "okay, you've played long enough" and stopping to see how much money you have left.

Note, there are a few other options here that we're going to use to determine what information comes back out of this function (basically so we can make some plots and stuff).

In [285]:
def play_poker(money, sim_strength=1000, max_count=10000, return_count=False, return_both=False, verbose=False):
    money = 20
    money_tally = [money]
    count = 0
    
    # Checks to see if we have enough money to play and that we haven't been playing too long
    while money > 0 and count < max_count:
        # bets 1 dollar and counts as playing
        count += 1
        money -= 1 
        
        # Get the cards setup
        cards = deck()
        cards.shuffle()
        cards.deal_five()

        # Set up our result checker
        d = {}
        for c in possible_hold_combos:
            d[str(c)] = []
                  
        # Now loop through all the hands and check what the expected score is. This is the Monte
        # Carlo part. We're using a bunch of random draws to see what the best move is statistically.
        for combo in possible_hold_combos:
            for i in range(sim_strength):
                cards.draw_cards(ids_to_hold=combo, shuffle_remaining=True)
                jb = jacks_or_better_scorer(cards.final_hand)
                d[str(combo)].append(jb.score)

        results = []
        for c, v in d.items():
            results.append((c,np.mean(v)))
        
        # Get the best possible move
        best_move = eval(sorted(results, key=lambda x: x[1], reverse=True)[0][0])
        
        # Now actually draw the cards based on that move (note we shuffle here so we aren't just using
        # whatever the last set of cards were. It would be cheating if we didn't shuffle.)
        cards.draw_cards(ids_to_hold=best_move, shuffle_remaining=True)
        winnings = jacks_or_better_scorer(cards.final_hand).score
        
        # Now keep track of our winnings and print if we ask it to. Then play again.
        money += winnings
        money_tally.append(money)
        if count%10 == 0 and verbose:
            print("Hand %i, Money: %i"%(count,money))
    
    # Return things, whether it be money or lists of moneys, or how long we've been playing.
    if return_both:
        return count, money_tally
    elif return_count:
        return count
    else:
        return money_tally

And now let's test it by watching our money from a game with verbose turned on.

In [264]:
money_tally = play_poker(20, sim_strength=100, verbose=True)
Hand 10, Money: 17
Hand 20, Money: 27
Hand 30, Money: 39
Hand 40, Money: 39
Hand 50, Money: 40
Hand 60, Money: 43
Hand 70, Money: 42
Hand 80, Money: 50
Hand 90, Money: 51
Hand 100, Money: 48
Hand 110, Money: 44
Hand 120, Money: 44
Hand 130, Money: 50
Hand 140, Money: 47
Hand 150, Money: 49
Hand 160, Money: 45
Hand 170, Money: 44
Hand 180, Money: 37
Hand 190, Money: 34
Hand 200, Money: 38
Hand 210, Money: 45
Hand 220, Money: 40
Hand 230, Money: 43
Hand 240, Money: 63
Hand 250, Money: 63
Hand 260, Money: 59
Hand 270, Money: 61
Hand 280, Money: 54
Hand 290, Money: 49
Hand 300, Money: 49
Hand 310, Money: 44
Hand 320, Money: 52
Hand 330, Money: 48
Hand 340, Money: 56
Hand 350, Money: 51
Hand 360, Money: 46
Hand 370, Money: 46
Hand 380, Money: 45
Hand 390, Money: 46
Hand 400, Money: 43
Hand 410, Money: 39
Hand 420, Money: 30
Hand 430, Money: 24
Hand 440, Money: 22
Hand 450, Money: 24
Hand 460, Money: 24
Hand 470, Money: 25
Hand 480, Money: 25
Hand 490, Money: 20
Hand 500, Money: 22
Hand 510, Money: 22
Hand 520, Money: 20
Hand 530, Money: 25
Hand 540, Money: 19
Hand 550, Money: 16
Hand 560, Money: 9
Hand 570, Money: 0

And now let's plot our money over time!

In [266]:
plt.figure(dpi=200)
max_len = len(money_tally)
plt.plot(range(max_len), money_tally);
plt.plot([0,max_len],[20,20],'r--', lw=2, alpha=0.5)
plt.xlabel("Number of Hands")
plt.ylabel("Amount of Money");

Nice! It all works, we've gained a bunch of money, then promptly lost it. Just like real gambling! In order to make sure our system is working, let's compare it to not going monte carlo and instead just playing totally at random. We better do a lot better than the random system.

Let's compare this to randomly choosing a combo to hold.

In [267]:
def play_poker_randomly(money, max_count=10000, return_count=False, verbose=False):
    money = 20
    money_tally = [money]
    count = 0
    while money > 0 and count < max_count:
        count += 1
        money -= 1 
        cards = deck()
        cards.shuffle()
        cards.deal_five()

        cards.draw_cards(ids_to_hold=np.random.choice(possible_hold_combos), shuffle_remaining=True)
        winnings = jacks_or_better_scorer(cards.final_hand).score
        money += winnings
        money_tally.append(money)
        if count%10 == 0 and verbose:
            print("Hand %i, Money: %i"%(count,money))
    if return_count:
        return count
    else:
        return money_tally

Let's play 100 games randomly, and 100 games with our smarter system, then compare the results.

In [271]:
random_survive = []
for _ in range(100):
    random_survive.append(play_poker_randomly(20, max_count=100, return_count=True))
In [276]:
smart_survive = []
for _ in range(100):
    smart_survive.append(play_poker(20, max_count=100, sim_strength=100, return_count=True))
In [278]:
import seaborn as sns
plt.figure(dpi=200)
sns.distplot(random_survive, label="Random Play");
sns.distplot(smart_survive, color='r', label="Smartest Play");
plt.legend(loc='upper right');
plt.title("Number of Hands Survived with 20 Credits (Max 100)");

Awesome! That's very bi-modal. That means we have two totally different behaviors overall. Our smart playing is clrearly doing much, much better than randomly playing. What happens if we turn the number of simulations per combination down? We'd imagine that the Monte Carlo brain wouldn't gather as much information and therefore would be a bit dumber. Let's check.

In [282]:
smart_survive_2 = []
for _ in range(100):
    smart_survive_2.append(play_poker(20, max_count=500, sim_strength=50, return_count=True))
In [284]:
plt.figure(dpi=200)
sns.distplot(smart_survive_2, color='r', label="Smartest Play", bins=20);
plt.legend(loc='upper right');
plt.title("Number of Hands Survived with 20 Credits (50 Sims per hold combo)");
In [ ]:
smart_survive_3 = []
smart_tally_3 = []
for player_num in range(53):
    print("Player ", player_num)
    i,j = play_poker(20, max_count=1000, sim_strength=200, return_both=True)
    smart_survive_3.append(i)
    smart_tally_3.append(j)
In [301]:
plt.figure(dpi=200)
sns.distplot(smart_survive_3, color='r', label="Smartest Play", bins=20);
plt.legend(loc='upper right');
plt.title("Number of Hands Survived with 20 Credits (200 Sims per hold combo)");

Nice! Adding more simulations makes it smarter. It also adds a lot of computation time. I played with it a bit, and while going to a few thousand simulations per combo does make it smarter than only a few hundred... it's not by too much. So now we can actually answer the question - can you beat video poker - by letting a bunch of players play, then simulating a few hundred times per hold combination, and seeing how long they can last.

In [302]:
max_max_len = 0
plt.figure(dpi=200)
plt.xlabel("Number of Hands")
plt.ylabel("Amount of Money");
plt.title("Money vs Number of Hands (53 players - 200 sims per combo)")
for money_tally in smart_tally_3:
    max_len = len(money_tally)
    if max_len > max_max_len:
        max_max_len = max_len
    plt.plot(range(max_len), money_tally, lw=1);
plt.plot([0,max_max_len],[20,20],'k--', lw=2, alpha=0.5);
In [307]:
total_survivors = 0
total_profitors = 0
for x in smart_tally_3:
    if len(x) == 1001:
        total_survivors += 1
        if x[-1] > 20:
            total_profitors += 1
            
print("Percent lasting 1000 hands: ", str(round(total_survivors/len(smart_tally_3),3)))
print("Percent with profit after 1000 hands: ", str(round(total_profitors/len(smart_tally_3),3)))
Percent lasting 1000 hands:  0.075
Percent with profit after 1000 hands:  0.057

As you can see, the vast, vast majority of players lose, even if playing as smart as possible. Out of 53 players, only 7% had ANY money left after 1000 hands. In fact, almost 80% lost everything within the first 100 hands! Only 5% had any profit after 1000 hands!

So, the system is weighted so heavily against the player that even if you play perfectly, the casino will come out on top 95% of the time. Given an infinite number of hands, the casino ALWAYS wins.