Prime pair sets

I started playing around with Project Euler way back in November 2011, not least as an opportunity to hone my still nascent Python skills. And I’m still learning.

Problem 60 states:

The primes 3, 7, 109, and 673, are quite remarkable. By taking any two primes and concatenating them in any order the result will always be prime. For example, taking 7 and 109, both 7109 and 1097 are prime. The sum of these four primes, 792, represents the lowest sum for a set of four primes with this property.

Find the lowest sum for a set of five primes for which any two primes concatenate to produce another prime.

This took a bit of thinking about but the approach I eventually came up with was to create a set of concetanatable primes for each prime (up to an arbitarily chosen value of 10000). Then all I would need to do is search each set for intersecting elements until I found five intersecting sets. Since the primes are generated in order, from lowest to highest, the first set of five intersecting sets will give me the answer.

Then for the implementation…

I already have a function that returns a list of primes, so that was easy, and using a dictionary keyed by prime was a no-brainer, but looking for set intersections was a bit of a struggle. Until I discovered that Python will do it for me. Using the set class it becomes remarkably easy to build a dictionary of sets and then work through them, checking intersections, until I find five sets that intersect.

My actual code is neither nice nor fast, and it isn’t going to scale at all – but I am quite pleased to have gained a handle on yet another iterable type.

Comparing exponentials with logarithms

Project Euler problem 99 is as follows:

Comparing two numbers written in index form like 211 and 37 is not difficult, as any calculator would confirm that 211 = 2048 < 37 = 2187.

However, confirming that 632382518061 < 519432525806 would be much more difficult, as both numbers contain over three million digits.

Using base_exp.txt, a 22K text file containing one thousand lines with a base/exponent pair on each line, determine which line number has the greatest numerical value.

This had me stumped for a while because, as far as I could find, there is simply no realistic way of calculating the large exponentials in the list. And then I learned something:

log(xy) = ylog(x)

And once you know that, finding the solution is very easy indeed.

Dice combinations

I’m always a little uncertain as to how far I can rasonably talk about Project Euler problems. The site is a superb resource that challenges you to develop programs to solve mathematical problems. For myself, it has helped both to improve my Python programming skills and to broaden my appreciation of just how diverse and fascinating a field Mathematics is.

However, the site does ask you not to share solutions and I can see the reason for this, If you can just copy and past the answer to a problem, then it all becomes a little pointless.

On the other hand, I do sometimes come up with solution for which I feel (I think) justifiably proud. In the case of Problem 250, the solution I was pleased with is not the solution to the problem, so I feel reasonably justified in talking about it.

So here goes…

Peter has nine four-sided (pyramidal) dice, each with faces numbered 1, 2, 3, 4.
Colin has six six-sided (cubic) dice, each with faces numbered 1, 2, 3, 4, 5, 6.

Peter and Colin roll their dice and compare totals: the highest total wins. The result is a draw if the totals are equal.

What is the probability that Pyramidal Pete beats Cubic Colin? Give your answer rounded to seven decimal places in the form 0.abcdefg

This looked to me like a simple question question of number crunching. If I know all of the combinations and frequencies that six six-sided dice can produce, and all the combinations and frequencies for nine four-sided dice, then I can simply work through every possible dice roll and see who wins.

The fun began when I decided I wanted a generic function that would give me, for any number of a dice, the frequency of every possible combination. It’s taken a while, but the final result turned out to be a lot simpler than I expected.

from itertools import product

def list_dice_combinations(number, dice):
    die = []
    for i in range(1, dice + 1):
        die.append(i)

    combinations = {}
    for roll in product(die, repeat = number):
        total = sum(roll)
        if total in combinations:
            combinations[total] += 1
        else:
            combinations[total] = 1
    return(combinations)

How you use this function is, of course, completely up to you.

Fifty solutions in sixteen months

Observant readers (both of you) will notice a graphic has just popped up in the sidebar of this blog. This should update automatically to reflect my Project Euler progress.

And, yes, after 16 months I have finally solved the first 50 problems.

I have to admit that some of my solutions are a little slower than they should be, but I am quite pleased to have made it this far and it is quite nice to be able to see the extent to which my Python is improving.

Yielding pandigitals

An n-digit number is pandigital if it makes use of all the digits 1 to n exactly once; for example, the 5-digit number, 15234, is 1 through 5 pandigital. This is a thing that I first encountered while solving Project Euler problems and it keeps on turning up – most recently (for me) in problem 43.

The number, 1406357289, is a 0 to 9 pandigital number because it is made up of each of the digits 0 to 9 in some order, but it also has a rather interesting sub-string divisibility property.

Let d1 be the 1st digit, d2 be the 2nd digit, and so on. In this way, we note the following:

d2d3d4=406 is divisible by 2
d3d4d5=063 is divisible by 3
d4d5d6=635 is divisible by 5
d5d6d7=357 is divisible by 7
d6d7d8=572 is divisible by 11
d7d8d9=728 is divisible by 13
d8d9d10=289 is divisible by 17

Find the sum of all 0 to 9 pandigital numbers with this property.

My eventual approach was (mainly) to brute-force my way through all of the possible pandigitals and testing each against the divisible criteria listed. This, however, raised its own problem: How do I quickly generate a list of all nine digit pandigitals?

The answer turned out to be a rather elegant application of the itertools module.

#!/usr/bin/env python

import itertools

def yield_pandigitals(length):
    """ Yields all 0 to length pandigitals 
        length is an integer between 1 and 9 """ 
    for pandigital in list(itertools.permutations(range(length + 1))):
        yield int(''.join(map(str, pandigital)))

Well, I was rather pleased with it.

Desperately seeking Pythagoras

Also known as Project Euler problem 39:

If p is the perimeter of a right angle triangle with integral length sides, {a,b,c}, there are exactly three solutions for p = 120.

{20,48,52}, {24,45,51}, {30,40,50}

For which value of p ≤ 1000, is the number of solutions maximised?

My first thought here was to generate an iterable collection of right angled triangles and accumulate a count of triangles by perimeter size. Don’t try this – it is far too slow an approach.

So my second thought was to try and iterate over the perimeter sizes and calculate the number of right angle triangles for each. It’s a long time since I had to think about Pythagoras but, for a triangle with sides a, b and c and a perimeter of p, I can make the following two statements:

a + b + c = p
a^2 + b^2 = c^2

Based on the second statement, above, I can also assert:
c = sqrt(a^2 + b^2)

Also:
c = p - a - b

Which means:
a^2 + b^2 = (p - a - b)^2

This can be expanded and simplified to state:
0 = pp - 2pb - 2ap + 2ab

or:

b = p(p - 2a) / 2(p - a)

Using this formula, we can say that for any values of p and a where p > a, if b is an integer then we have a right angle triangle.

One further optimisation: Intuitively, a can’t be greater than p/3.

Implementing this can then be done with two very simple for loops. The outer one iterates over all values of p and the inner one iterates over the range (2, int(p / 3)) and counts the number of integer values of b.

Desperately seeking pandigitals

Also known as Project Euler Problem 38:

Take the number 192 and multiply it by each of 1, 2, and 3:

192 × 1 = 192
192 × 2 = 384
192 × 3 = 576

By concatenating each product we get the 1 to 9 pandigital, 192384576. We will call 192384576 the concatenated product of 192 and (1,2,3)

The same can be achieved by starting with 9 and multiplying by 1, 2, 3, 4, and 5, giving the pandigital, 918273645, which is the concatenated product of 9 and (1,2,3,4,5).

What is the largest 1 to 9 pandigital 9-digit number that can be formed as the concatenated product of an integer with (1,2, … , n) where n > 1?

This is a problem for which finding a practical approach took a lot longer than the implementation of said approach. It’s also a problem that forced me to think about the actual maths of the problem. I already have an is_pandigital function, which I wrote to help solve problem 32, so the main challenge here is to constrain the search enough that I can avoid endless looping.

So, a bit of analysis…

For n = 2,
9 * (1, 2) = 918
98 * (1, 2) = 98196
987 * (1, 2) = 9871974
9876 * (1, 2) = 987619752 <- That's 9 digits

Since the problem mentions the pandigital, 918273645, I plugged this into the calculator as well,
9182 * (1, 2) = 918218365 <- That's 9 digits again

For n = 3,
9 * (1, 2, 3) = 91827
98 * (1, 2, 3) = 98196294 <- 8 digits is not enough
987 * (1, 2, 3) = 98719742961 <- 11 digits is too many

So I now know that the number I am looking for is n * (1, 2) where n is between 9876 and 9182.

The implementation, therefore, is a very simple loop that checks is_pandigital(int(str(n) + str(n * 2))) for values of n from 9876 to 9182.

The result is almost instantaneous.

Fun with Factorions

Also known as Project Euler problem 34:

145 is a curious number, as 1! + 4! + 5! = 1 + 24 + 120 = 145.

Find the sum of all numbers which are equal to the sum of the factorial of their digits.

Note: as 1! = 1 and 2! = 2 are not sums they are not included.

According to Wikipedia, A factorion is a natural number that equals the sum of the factorials of its decimal digits. So now we know what we are looking at. Wikipedia also tells us how many decimal factorians exists, but I wrote some code anyway.

The first part of this is to determine whether a given number is a factorian:

import math

def is_factorion(integer):
    """ Returns True if integer is a factorion, else False
        See problem 34 for a more detailed explanation """
    digit_sum = 0
    for n in str(integer):
        digit_sum += math.factorial(int(n))
    if digit_sum == integer:
        return True
    else:
        return False

And then for the reason I found myself googling this: What is the range of numbers I need to check?

It turns out that:

If n is a natural number of d digits that is a factorion, then 10d − 1 ≤ n ≤ 9!d. This fails to hold for d ≥ 8 thus n has at most 7 digits, and the first upper bound is 9,999,999. But the maximum sum of factorials of digits for a 7 digit number is 9!7 = 2,540,160 establishing the second upper bound.

Problem 34 tells us to ignore 1 and 2 so the solution can be derived by simply summing each number between 3 and 2540160 for which is_factorion is True.

Find the sum of all numbers that can be written as pandigital products.

Also known as Project Euler problem 32. Yes, I am still slowly working my way through these and – in this case – I am quite pleased with myself for having a solution, not least because I had to read the problem twice before I actually understood it.

The problem is this:

We shall say that an n-digit number is pandigital if it makes use of all the digits 1 to n exactly once; for example, the 5-digit number, 15234, is 1 through 5 pandigital.

The product 7254 is unusual, as the identity, 39 186 = 7254, containing multiplicand, multiplier, and product is 1 through 9 pandigital.

Find the sum of all products whose multiplicand/multiplier/product identity can be written as a 1 through 9 pandigital.

HINT: Some products can be obtained in more than one way so be sure to only include it once in your sum.

That hint turns out to be useful.

My first thought was that I need a function to identify whether a number is pandigital. This turned out to be easier than I thought and I quickly came up with this:

def is_pandigital(integer):
    """ Returns True if ingeger is pandigital, else false
        A pandigital integer of length n is one that contains all
        digits from 1 to n exactly once
        integer is (obviously) an integer """
    text = list(str(integer))
    text.sort()
    for n in range (0, len(text)):
        if str(n + 1) != text[n]:
            return False
    return True

Now that I can test for pandigitality (is that a word?) I need to figure out what numbers to test. It was at this point that I realised that I already had a list_divisors function, written to solve an earlier problem. Handy, and a bit of trial and error established that I only need to check for four digit products if I want the product and its divisors to total 9 digits in length.

So…

from euler import is_pandigital, list_divisors

products = []

for n in range(1111, 9999):
    divisors = list_divisors(n)
    for m1 in divisors:
        m2 = n / m1
        as_text = str(n) + str(m1) + str(m2)
        if len(as_text) == 9:
            if is_pandigital(int(as_text)) == True:
                products.append(n)
                break

print reduce(lambda x, y: x+y, products)

The hint helps because once I have established that a product and its divisors amount to a pandigital number, I can stop checking (hence the break).

This is probably not the fastest solution available, but it works and it’s very easy to understand.

Project Euler: The Journey Begins

I haven’t mentioned Project Euler for a while, but I haven’t been ignoring the problems either. What I have been doing is building a library of reusable functions.

I noticed that a number of the problems are variations on previous solutions and was starting to find that I was spending more time finding code to copy and paste from than actually solving the problems. Having everything logically arranged in a single library should make this a lot easier.

It took a while, but seems to be paying off as I have now joined the 17.92% of people that have made it to Level 1.