Counting

Pigeon-hole principle

Pigeon-hole principle: No matter how n items are distributed among k holes, there must be at least ⌈n/k⌉ in some hole.

Proof by contradiction: Assume false, i.e., nk⋅(⌈n/k⌉-1). By algebra, that is strictly less than k⋅(n/k), since ⌈n/k⌉ < n/k+1. That is simply n. So, we have a contradiction, since n can't be less than itself.


Some Introductory Examples

How many ways to choose a password that is…

Calculate each of these numerically, to get an idea of scale.

How long to crack by brute-force, if each guess takes 10-7sec? (Note: 1 day is approximately 10000 sec.)

In the movies, you often see the good/evil hacker, with their display whirring through all possible inputs keys, gradually finding each password letter-by-letter. This only works if you can somehow determine something like "aha, after trying some inputs, I know the first letter of the password must be 'K'". How long to crack under this assumption?


Permutations

How many ways to order n items? n × (n-1) × … 2 × 1 = n!.

How many ways to order k of n items? P(n,k) = n × (n-1) × … (n-k+2) × (n-k+1) = n!/(n-k)! Alternate notation: Pnk.

Assumes choices are "independent".

Example: In a LAN with n nodes, How many sender/receiver possibilities, (if you can't send to yourself?) I.e., how many connections. If each message is sent to one other member (to ensure no personal mail?), how many sender/receiver/verifier arrangements?


Combinations

How many ways to choose k of n items, when order doesn't matter:

C(n,k)= n!/(k!(n-k)!). Note that C(n,k) = C(n,n-k).

Alternate notation: Cnk or
( n )
k

Examples:


Pascal's triangle

Pascal's triangle is a fun tool for exploring facts about combinations. The following is only a sample.

Pascal's triangle

Claim: Each item corresponds to C(row#,item#) with 0-based numbering.

Claim: Each row is a power of two.

Claim: Each row gives constant factors in expansion of (a+b)n. "Binomial theorem."


Problem:

A committee is being formed of 3 electrical engineers, 2 computer scientists, and 1 applied math undergrad — a rough proportion to these majors' enrollments. Each will be assigned one of the committee's six roles (chair, secretary, …). How many ways can the degrees be represented by the committee roles?

Motivating examples:

Consider the roles to be ordered:

EEECCM
EEECMC
EEEMCC
EECECM
EECCEM
…

I.e., how many 6-character strings can be made out of 3 E's, 2 C's, and 1 M.

Solution:

Imagine all letters are distinguished (not just two "C"s, but "C1" and "C2"); there are clearly 6! possibilities. But for every entry we can match it with the swapped-C version: each pair is the same, but our 6! is counting it twice. Divide by 2, to compensate for over-counting the C's. Even after this, we see that for every word, we can re-arrage the three E's in any of 3!=6 ways; thus every entry is part of a group of 6 identical strings, and we count all 6 instead of just one. Thus our final answer is 6!/(3!2!1!) = 60.

Generalizing:

Out of n items, if you want to choose k1 of type 1, k2 of type 2, …, ki of type i, (where k1 + k2 + … + ki = n),
then there are n/(k1! ⋅ k2! ⋅ … ⋅ ki!) ways of doing so. I.e., this is the number of permutations of n objects where there are groups of k1, …, ki equivalent objects.


Problem:

Leebron decrees that each college will have a new computer lab. Each gets n=5 computers, each from one of r=3 vendors: Sun, HP, and Apple. Of course, each college wants to be unique and to have a different lab setup than all other colleges. How many colleges can all have a different setup? We don't care where in the lab they're placed.

Examples:

Start to enumerate:

SSSSS
SSSSH
SSSSA
SSSHH
SSSHA
SSSAA
SSHHH
SSHHA
SSHAA
SSAAA
…

Can we detect a pattern? Especially one that will generalize? (The consistent ordering of this enumeration may help.)

Solution:

Let's rewrite our enumeration, adding a divider between each item type:

SSSSS||
SSSS|H|
SSSS||A
SSS|HH|
SSS|H|A
SSS||AA
SS|HHH|
SS|HH|A
SS|H|AA
SS||AAA
…

How many "dividers" are there? r-1=2

Now, let's focus only on the dividers and note that we haven't lost any info:

_ _ _ _ _ | | 
_ _ _ _ | _ | 
_ _ _ _ | | _ 
_ _ _ | _ _ | 
_ _ _ | _ | _ 
_ _ _ | | _ _ 
_ _ | _ _ _ | 
_ _ | _ _ | _ 
_ _ | _ | _ _ 
_ _ | | _ _ _ 
…

How long is each string? n+r-1=7

Want to place r-1 dividers in any of n+r-1 positions — there are C(n+r-1,r-1) ways to do so.

Also, note that C(n+r-1,r-1) = C(n+r-1,n). This is the number of combinations of n objects, where are there are r types of equivalent objects.


Problem:

What if we do care where in the lab they're placed, how many possible labs are there?

Solution:

How many options for position 1? r

How many options for position i? r

Total: rn


Problem:

Let A be the set of Freshmen, B the set of Hanszenites, and C the set of CS majors. What is the size of ABC?

Solution:

|A|+|B|+|C| is an approximation.

How can we improve upon this? Don't count duplicates. E.g., BC are counted twice, so subtract them out.

Our new estimate is |A|+|B|+|C| - (|AB| + |AC| + |BC|).

Are we done? No. This now correctly counts people in just one set, and correctly counts people in exactly two, but the people in all three were added three times were added three times then subtracted three times. We must add them in once more.

Our new estimate is |A|+|B|+|C| - (|AB| + |AC| + |BC|) + |ABC|

Are we done? Yes.

Generalizing:

Note that the number of terms in our calculation:

The pattern generalizes, e.g., to calculate the size of the union of four sets, it's