copyright ©David Ness, 9 July 2001
[Draft
#1 --- 2001/07/11 0517 UTC]
This document is the second one of a series that are being written to describe programs that simulate some well-known games. The purpose of these documents is to illustrate how some interesting computer languages can be used to deal with problems that do not have an `obvious' computational formulation. That is to say, these problems are not inherently or obviously numerical in character.
Thus our purpose is to expose how these problems might be dealt with by some of these new programming languages, particularly J and K . We recognize, at the outset, that we are not trying to contribute (much) to the study of the games themselves, but rather only using them as a vehicle to help us expose the characteristics of the languages.
Games are an interesting problem domain. In general they are characterized by:
Many games, and here poker is a good example, are complicated enough that is is not trivial for us to deal in an analytic fashion with the possibilities, yet simple enough that we can often---given the power of today's computers---perform exhaustive enumerations of the game space to allow us to check the quality of the simulation techniques that we use to help us understand the games.
It is also often the case that very simple changes to the rules of the game can produce situations which are complicated enough to make any analytic formulations impossible. An example of this is a game like `Deuces Wild' which creates situations of considerable complexity out of very simple rules.
J and K belong to a class of vector (or array) oriented interpretive languages. These languages, while interpreted, generally perform much better in problem solving situations than many people might suspect, because they implement basic vector and matrix operations with such efficiency that their overall performance is often excellent. While interpreters generally consume more computer resource than compiled programs with respect to some aspects of a problem, the languages we are looking at here are so efficient in their handling of vectors and matrices that they may, on balance, be very effective in handling certain classes of problems.
These languages are regularly used in situations where high performance real time work needs to be done. Indeed, both languages have considerable currency on Wall Street where they are often used to perform central core operations in trading systems that need to be able to handle analytical calculations on large data bases which are conjoined with streams of active market data.
This document is generated in two versions. In one version the examples and programs are all expressed in J , while in the other they are all expressed in K.
The programs given here are early attempts by the author in the use of the languages we describe. More experienced programmers might find better formulations, but as our purpose is to look at the application of the languages, our hope is that well-explained code will help the readers come to understand at least one approach to the problems, and that it will spur them on to develop their own formulations.
Card games fall into a number of broad classes. The games in each class share many common characteristics. Some of these categories, and their primary characteristics, include:
We might choose different representations for each of these games. For example, if we were interested in talking about Bridge, the concept of `suit' turns out to be very important. In many situations one has to answer questions like `Does the hand have any Spades' There is also a ranking of suits and a concept of a trump suit.
For Poker, this isn't the case. While the value of some of the hands may be dependent on simple characteristics of suit (i.e. they are all of the same suit), for the most part suit is of little interest, while `denomination' plays a rather large role. This leads to some natural choices with respect to representation.
We begin by developing some technology that will allow us to talk about hands of Cards. In order to be able to do this, though, we will have to have enough knowledge of K to have the code we present make sense.
The fundamental underlying data structure implemented in the code that will be presented here is to store each card as a pair of integers where the first element (an integer between 0 and 12) represents the denomination of the card (deuce, trey, ..., king, ace) and the second element (an integer between 0 and 3) represents the suit.
Underneath it all, of course, is a fundamental representation of a card as a number between 0 and 51. Given this representation, an n card hand is a combination of n elements taken from this domain. Notice, however, that this particular form for cards is not a particularly convenient one to work with as, for example, the cards numbered 12, 25, 38, and 51 have no directly apparent relation to one another, yet when translated to the representation used here they all have a first element of 12 and thus represent the four aces.
After introducing a few simple operators, we will present and discuss the code to implement this data structure.
K operators all have names. Actually most operators have two names, one used when the operator is monadic and the other used when the operator is dyadic. For example, ^ represents `shape' when monadic and `power' when dyadic. Similarly, % is `reciprocal' when monadic and `divided by' when dyadic.
We also note that K is 0-origin. The `first' column of a matrix is called Column 0. This applies to everything in K. Further, arguably the most important operator, at least from a readability standpoint, is :, which is used for assignment, as for example in a:2 which assigns 2 to the name a.
Operators we are about to need can now be described briefly. K `Help' facilities can be used to get the complete word on any of these definitions. These amazingly terse and helpful facilities are reached by typing a backslant \.
The basic math operations are + - * %, and they all have the usual dyadic meaning. In K the monadic meaning of some of these operators is quite far removed from that in other languages, and this is worth noting early, as it takes a bit of getting used to if you have experience with other programming languages.
In K a function definition has a very simple form:
In the definition the `argument parameters' are assumed to be x y z unless a declaration makes them something else. The functions given here happen to be monadic, so only the variable x appears in the definitions.
Let's start with the definition of Card. The argument of Card is assumed to be and integer in the 0 thru 51 range (or a list of such integers). The definition `parses' around the ; which is the list constructor. Since K (like APL and J) is strictly right-to-left in evaluation, the first calculation is to divide the argument (x) by 13, and then take the floor (_) of the result. This produces the right-hand element of the list. The left-hand element is produced by the dyad !, which is mod/rotate. In this context mod applies, and Card takes its argument (x) and calculates it mod 13, thus producing an integer in the range 0 thru 12.
To be precise, Card of a scalar produces a two-element vector representing the denomination and suit. Card of a vector of n cards produces a (2 x x) matrix. The first row of this matrix is the denominations, and the second row is the suits.
This is a good time to introduce the monadic +, known as flip. flip transposes any matrix that it is applied to. If we produce a vector of cards with Card we will get a two row matrix representing the cards. If we would prefer an n row matrix of two columns we can just use flip to get it. In this problem domain we generally prefer the 2 row representation because most of our functions operate on denomination and suit separately.
The deal function makes use of flip as well as dyadic # and the function _draw. In addition the adverb ' is used. Let's describe what is going on. The core of the definition of deal is the _draw function, and some comments are in order.
First, `built-in' functions in K generally have names that begin with an underscore _. Thus when we see something like _draw we know to look to K documentation to find the meaning, not to other parts of the current program. The (terse) definition of _draw is that x _draw y `draws' x random elements from 0 ... y-1. It also says that if the right-hand argument is negative then the draw is `without replacement'. Thus 5 _draw -52 will draw a five-card hand from a full deck.
Second, the dyadic # is called take/reshape. For scalar arguments, as used here, x # -52, for example, produces a vector of x -52s.
Finally, in the deal function we now have x 5s to the left of _draw and x -52s to the right of it. The adverb ' which stands for EACH simply tells K to apply the _draw function to EACH element of the left and right arguments, stepping through both vectors in EACH step. In this particular case that produces x hands each containing 5 cards.
Perhaps it is clear by now, given the number of words written in the description, just why K is such a powerful language. It allows us to state very tersely what might otherwise be quite long discourse.
CdNms is just data. It gives `names' to the denominations (in the first row) and suits (in the second row). This also emphasises the notion that in K `matricies' can be `ragged'. Perhaps a better way to say it is that in K everything is scalar, list or list of lists, with a special provision for handling lists that happen to be rectangular (for example, you can't flip a list of lists unless the underlying list is rectangular).
Finally, ShowHand takes the CdNms data and applies it in a convenient way to a hand, or list of hands, to produce readable output. Later we'll tackle the task of explaining each step in the ShowHand function, but first it is time to get some experience actually using the functions.
deals 10 hands and stores them in the matrix Hands. It is not very illuminating to look at the hands in their `native' form, as perhaps a glance at the first three (more would just be more boring) will prove:
This example relies the monadic !, known as enumerate. In this context !3 produces 0 1 2, so Hands[!3] enumerates the first 3 elements of the array Hands.
ShowHand allows a more convienent view.
Which is a fairly comfortable way to look at the hands. The fastidious may want to look at the first three hands to verify that they are the same as is indicated in their numerical representation above.
Key to our evaluation of Poker hands is the Cnt function:
This function, as is typical perhaps, of K does an amazing amount of work in a very terse statement. Stated in words, the Cnt function takes a hand and tallies the number of each denomination which happen to appear in the hand. This tally is presented in a 13-vector which indicates the number of (deuces, treys, ..., kings, aces) that appear in the hand.
Let's look at our first hand and its Cnt:
The Cnt function's result essentially says `There are no deuces, two treys, one four, ..., one ten, ... one king, and no aces in the hand. Cnt can also be applied to a vector of n hands in which case it will produce an n by 13 matrix as a result.
Explaining Cnt is challenging, but possible. It is yet another illustration of just how terse K can be. To read it we have to understand that the steps in a function are, thank goodness for sanity, executed left to right order, even while the steps themselves perform their operations in right-to-left.
First, the local variable N is given a vector of 13 zeros. Second, the form N[x[0]]+:1 accomplishes selecting the denominations from the hand (via x[0], which is the denominations of x, as opposed to x[1], which would be the suits of the cards. Each element of N is bumped (the +) by 1 for each value of x[0] that is encountered.
Finally, the :N indicates that N should be returned as the `value' of the function. When we apply the Cnt function we use Cnt' if we want to apply it to a list of hands in order to apply it to each of the hands in the list.
Now to relate this to poker. Think, for a minute, about the Cnt of some hand. Also, let's not worry the `straight' and `flush' properties of hands yet. If we sort the Cnt of a hand we'll obtain a 13 element vector with at most 5 non-zero elements, all of which will appear at the right. By this point only relatively few possibilities remain:
The ultra-observant will probably object to the appearance of 0 0 0 0 5 in that no simple poker hand can, with a legal deck, produce a hand which has five of anything. However, to leave open the possibility of later introducing wild-cards, where `5 of a kind' becomes a possibility, so we'll incorporate it from the beginning.
The rest of the patterns represent some of the major states of a poker hand. For example, 0 0 1 2 2 represents a hand with two pair while 0 0 0 1 4 would occur for any hand that had four of a kind.
Let's use some K to accomplish this. To do it we'll need some K operators. First, monadic <, called upgrade performs an ascending sort. It is important to understand that sort functions in K (and in most well designed systems) produce results that are the indices of the argument arranged in an order such that if they are applied to the original data, it will be sorted. This allows the indices to be used to order other arrays as well as the key array.
So here we first get a hand on the count vectors by calculating:
So that we can use <:' Counts to order each of the rows. The <: tells K to use monadic < and the ' causes it to be applied to each row, not to the overall matrix). This can then be applied to the original data, and we can take the 5 right-most elements of each row with -5#'Counts @' <:' Counts.
This indicates that in six of the hands, all of the cards are of different denomination. We don't know, at this point, if any of these are straights or flushes, but otherwise they are uninteresting hands. Of the remaining four hands, three (Hands[0 3 9]) contain one pair, and Hands[8] has two pair. Now all we have to do is handle the straights and flushes and we'll pretty much have solved the problem of classifying the hands.
We can handle straights by creating a table of what kinds of counts hands with straights would have. These are mostly pretty simple as nine of the patterns are simply those with Counts where there are five ones in sequence. In K we can obtain these by creating a 9 by 13 array where each of the rows is 5 ones and 8 zeros, properly rotated. A cute trick makes this easy. Produce a 14 element vector consisting of 5 ones and 9 zeros. If this is used to produce a 9 by 13 array, there will be a `wrap-around' effect that will cause each successive row to be shifted one column to the right.
To this we must append one further special pattern to represent the A, 2, 3, 4, 5 straight. This is done explicitly.
To recapitulate, lets establish the following data:
Notice that there are ten different levels of poker hands (if we count five of a kind as a possibility). Let's define the Categ of a hand as follows:
| Name | Value |
| No Pair | 0 |
| Pair | 1 |
| Two Pair | 2 |
| Three of a Kind | 3 |
Continuing, it would make sense to assign straights and flushes to continue the sequence:
| Name | Value |
| Straight | 4 |
| Flush | 5 |
| Full House | 6 |
| Four of a Kind | 7 |
| Straight Flush | 8 |
| Five of a Kind | 9 |
We are left with only one, fortunately simple, computational problem. If we increment the value of hands that contain straights by 4 and those that contain flushes by 5, we are OK with respect all initial rankings (no hand containing a straight or a flush can be anything other than `no pair' otherwise). However, any hand that happened to be a straight flush would obtain a ranking of 9, thus conflicting with five of a kind. To avoid this problem, we can initially rank five of a kind at 10, and then after marking straights and flushes reduce all marks that are greater than 8 by one thus producing the desired values.
First, let's give each of the hands in the argument the initial categories as suggested above:
This classifies all non-straight, non-flush, non five-of-a kind hands properly. It relies on a very nice table lookup function available in K which is ?/: which essentially does a table lookup (?) of each element of the right argument (/:) and then uses the result as indices in an initial value table (0 1 2 3 6 7 10) that assigns the proper initial values.
We again use table lookup (?/:) to find any of the Counts that happen to match the straights patterned in Str. Since there are 10 different patterns of straights (remember they will be numbered 0 thru 9), anytime the `match' is with `line 10' it means we didn't have a match. This is tested by the 10 = and if we negate this test (~) we end up with the truth vector of `is a straight'. The monadic & is known as where, and this converts any truth vector into a list of indices where the vector is true. For these places we increment the Categ by 4.
Handling flushes is even simpler. Here we simply count the number of suits in each hand and increment the count by 5 wherever there is only one suit in the hand. The code here relies on monadic =, known as group to take the suits information (x[1]) and if there is only 1 `group' then the hand is a flush.
Then we make the final adjustment by decrementing each of the places where the Categ is greater than 8 by one. This completes the task of basic categorization.
At this point, all of the hands will be properly categorized, but, due to the fact that there can be ties in cagegory, the hands are not properly ranked yet.
Again, a simple rule almost handles the ranking problem very simply. Let's look at a simple concept: the denomination of the cards that comprise the elements that make up the Counts. First, we recognize that ranking by category is absolute in the sense that all hands of a higher category rank above any hand of a lower category. So all we have to do to resolve all ranking problems is to rank the hands that happen to fall into the same category.
We do not go far wrong if we define a card's importance by its role in the hand. For example, in a full-house, the three-of-a-kind is more important than the pair. Fully ranking the hands is (almost) saying that poker hands, within category, are ranked by comparing the denominations of the cards in the hand when considered by importance.
Thus, in four-of-a-kind, all that matters is the denomination of the four cards. 4 fours and a deuce beat 4 threes and an ace. With a full-house, the denomination of the three-of-a-kind determines the outcome. When we get to two pair, it is the highest pair that determines the outcome, unless both hands contain this pair. For example, a pair of deuces and a pair of aces beat a pair of queens and a pair of kings.
Whenever there is a tie for highest in the important card, the next most important card then determines. And so on. In the final analysis, for example, two no-pair hands are ranked by the highest card. If both hands contain this card, then the second highest card determines the outcome. If both hands contain the same two highest cards, the third highest determines, etc.
These rules work for straights and flushes as well, with one exception. Hands that contain A2345 straights are regarded as five high, and thus they lose to 23456 straights, for example, even though they contain an ace. The essential notion here is that in this particular case the ace can be treated a `low', not that `wrap-around' straights are allowed.
Managing the coding of all of this isn't actually very hard. If we take <:'Counts and monadic |, called reverse, which we achieve with (|:) generates the indices so we get just what we need: namely a list of the denominations of the cards, ranked by importance. In the implementation of the rank function that brings all of this work together we create a characteristic vector for each hand that combines Categ information with the denomination ranking to produce a ChrVec that properly ranks all hands, i.e. if a hand X beats hand Y, the rank of X will be greater than the rank of Y. We state, but don't prove here, that this relationship is, fortunately for all real poker players, transitive.
This does the work:
A couple of functions are useful for interacting with the programs that we have been discussing. Rsl tallies the results of running a number of hands. Show can be used to produce a rather clean little table of results, along with the names of the various categories.
On an 800Mhz Pentium III with 512m of memory, simulations proceed at the rates indicated in the following summary:
| Hands | Deal | Tabulate |
| 10000 | 0.219 sec | 0.499 sec |
| 20000 | 0.330 sec | 0.929 sec |
| 30000 | 0.489 sec | 1.319 sec |
| 40000 | 0.769 sec | 1.639 sec |
| 50000 | 0.880 sec | 1.869 sec |
| 60000 | 1.159 sec | 2.200 sec |
| 70000 | 1.309 sec | 2.359 sec |
| 80000 | 1.430 sec | 2.800 sec |
| 90000 | 1.600 sec | 2.960 sec |
| 100000 | 1.700 sec | 4.670 sec |
| 200000 | 2.750 sec | 6.430 sec |
| 500000 | 5.550 sec | permission |
The `permission error' on evaluating half a million simulations is not a problem with either K or with the code. The free version of K has some restrictions built in, and a simulation of this size apparently crosses one of these limits so the system signals as `error' to indicate that you are not permitted to use K to solve problems of this size (for free, that is). Also, to be completely fair to K it should probably be noted that most systems would have real trouble with problems even in the 200,000 range.
Early in this paper we used, but did not describe the detailed funtion of the ShowHand function. Now that we have more understanding of K we can describe this function in detail, perhaps with some useful point. First a reminder of the function:
The core function here is CdNms @' x which causes each element in Row 0 of x to be replaced by the corresponding entry of the elements in CdNms, the denomination names. Similarly each element in Row 1 is replaced by the names of the suits. Thus we might end up with something like:
Which is, as they say, `a start'. Next, this result is flipped, with the + into something that looks like:
Now, were ready for ,\:" " which essentially iterates the ," " operation across each element of the left argument, effectively placing a blank on the right end of each string. Finally, the ,/ operator places a dyadic join between each element, reducing the five rows to a single string:
There are currently two other papers in this series. One of the papers is sibling of this paper, except that it uses the programming language J as an example rather than K. The other paper treats some of the emerging `issues' that have surfaced in the process of developing this material. It is called `From Cards to Near Poker to Poker'. These papers can be found through the Home Page referenced at the top of this document, all under the general category of `Cards, Poker, J and K'.
| David Ness Date: 2001/07/11 0517 UTC |
|