01/28/22

Exploring Wordle with Julia: Part I

Wordle is a fun and easy game that has shot to fame recently. It is completely minimalistic, no-fills and highly addictive. We will explore here how to write as simple solver for Wordle with Julia and then adding layers of complexity and sophistication as we go along.

In Wordle we are successively guessing five-letter words. After each guess the game tells us which of our letters occur in the secret word and whether they are in the correct position. As explained in their website.

So after each guess we narrow the set of possible words to choose from till we arrive at the secret word, and we only have six tries.

Our Dictionary

Wordle only deals with a valid five-letter English words. So we need to start grabbing such words. A good place to start is the Standford GraphBase data set.

https://gist.github.com/mbeltagy/51e4672379a71e5d73c789e6106bc296#file-word_dict-jl

Constraints

The next step is the try narrow down the set of feasible words as we are getting clues. Towards that end, we construct a Dict to map letters that are in the word at the correct spot to their location, another Dict to maps letter in the word that are in the wrong spot, and finally a Set of letters that are not in the word:

https://gist.github.com/mbeltagy/51e4672379a71e5d73c789e6106bc296#file-constraint_definition-jl

Now we can encode the the response we get the game as String of numbers, where 0 means that the character is not in the word, 1 means the that character is the word but not in the right spot, and 2 means that the character is in the word at the correct spot. For example

Secret Word:    perky
Candidate Word: ebony
Response:       10002

We can update our set of constraint for any response by the following function:

https://gist.github.com/mbeltagy/51e4672379a71e5d73c789e6106bc296#file-update_constraints_by_response-jl

Shrinking the feasible set of words

The constraints will reduce the feasible set of possible words. This can be done by applying filters that make use of constraints as in the following function:

https://gist.github.com/mbeltagy/51e4672379a71e5d73c789e6106bc296#file-word_set_reduction-jl

Putting it all together

So all what we have to do now is to apply the following step

  1. Guess a candidate first word
  2. Get response from the Game
  3. Use that response to update the set of constraints
  4. Use the constraints to reduce the set of possible words
  5. Guess the new candidate word
  6. Repeat step 2 until a word is found or the game is over

In the steps 1 and 5 above, the simplest approach is just to pick a word at random. Below is Julia repl session showing how we go about applying the above. The secret word here is “super”

https://gist.github.com/mbeltagy/51e4672379a71e5d73c789e6106bc296#file-blog_julia_wordle_repl_session1-jl

We see that we arrived at the correct word after guessing

  • excel
  • buyer
  • fumer
  • purer
  • super

That is a 5/6 score. Not too bad! But we could do better. How? We explore that in the next part.

See also below an interactive session where the secret word is “wrung”

01/28/22

Expanding WordPress Julia’s GeSHi Syntax Highlighting

In your WordPress installation you want to modify the wp-content/plugins/wp-geshi-highlight/geshi/geshi/julia.php file.

/*
** builtins
*/
2 => array(
    'Array', 'String', 'Bool', 'Number', 'Int', 'Integer', 'Real', 'Complex',
    'FloatingPoint', 'Float64', 'Float32', 'Int8', 'Int16', 'Int32', 'Int64',
    'Rational', 'AbstractArray', 'Unsigned', 'Signed', 'Uint', 'Uint8', 'Uint16',
    'Uint32', 'Uint64', 'Vector', 'AbstractVector', 'Matrix', 'AbstractMatrix',
    'Type', 'IO',....

You are trying to expand the list of those builtins. In Julia, you extract that list by running

julia> [names(Core);names(Base)] .|> String |> x->filter(z->occursin(r"^[A-Za-z]+[A-Za-z0-9]+",z),x) |> x->join(["'$y'" for y in x],", ") |> x-> replace(x,r"(.{30,75},)\s"=>SubstitutionString(" "^12*"\\1\\n"))|> clipboard

You take the output in the clipboard and replace the contents of the aforementioned array.

11/9/19

Evaluating Arbitrary Precision Integer Expressions in Julia using Metaprogramming

While watching the Mathologer masterclass on power sums

I came across a challenge to evaluate the following sum
\(1^{10}+2^{10}+\cdots+1000^{10}\)

This can be easily evaluated via brute force in Julia to yield

function power_cal(n)
  a=big(0)
  for i=1:n
    a+=big(i)^10
  end
  a 
end
 
julia> power_cal(1000)
91409924241424243424241924242500

Note the I had to use big to makes sure we are using BigInt in the computation. Without that, we would be quickly running in into an overflow issue and we will get a very wrong number.

In the comment section of the video I found a very elegant solution to the above sum, expressed as

(1/11) * 1000^11 + (1/2) * 1000^10 + (5/6) * 1000^9 – 1000^7 + 1000^5-(1/2) * 1000^3 + (5/66) * 1000 = 91409924241424243424241924242500

If I try to plug this into the Julia, I get

julia> (1/11) * 1000^11 + (1/2) * 1000^10 + (5/6) * 1000^9 - 1000^7 + 1000^5-(1/2) * 1000^3 + (5/66) * 1000
-6.740310541071357e18

This negative answer is not surprising at all, because we obviously ran into an overflow. We can, of course, go through that expression and modify all instances of Int64 with BigInt by wrapping it in the big function. But that would be cumbersome to do by hand.

The power of Metaprogramming

In Julia, metaprogramming allows you to write code that creates code, the idea here to manipulate the abstract syntax tree (AST) of a Julia expression. We start to by “quoting” our original mathematical expressing into a Julia expression. In the at form it is not evaluated yet, however we can always evaluate it via eval.

julia> ex1=:((1/11) * 1000^11 + (1/2) * 1000^10 + (5/6) * 1000^9 - 1000^7 + 1000^5-(1/2) * 1000^3 + (5/66) * 1000)
 
:((((((1 / 11) * 1000 ^ 11 + (1 / 2) * 1000 ^ 10 + (5 / 6) * 1000 ^ 9) - 1000 ^ 7) + 1000 ^ 5) - (1 / 2) * 1000 ^ 3) + (5 / 66) * 1000)
 
julia> dump(ex1)
Expr
  head: Symbol call
  args: Array{Any}((3,))
    1: Symbol +
    2: Expr
      head: Symbol call
      args: Array{Any}((3,))
        1: Symbol -
        2: Expr
          head: Symbol call
          args: Array{Any}((3,))
            1: Symbol +
            2: Expr
              head: Symbol call
              args: Array{Any}((3,))
                1: Symbol -
                2: Expr
                3: Expr
            3: Expr
              head: Symbol call
              args: Array{Any}((3,))
                1: Symbol ^
                2: Int64 1000
                3: Int64 5
 
 
julia> eval(ex1)
-6.740310541071357e18

The output of dump show the full AST in all its glory (…well almost the depth is a bit truncated). Notice that here all our numbers are interpreted as Int64.
Now we walk through the AST and change all occurrences of Int64 with BigInt by using the big function.

function makeIntBig!(ex::Expr)
  args=ex.args
  for i in eachindex(args)
       if args[i] isa Int64
          args[i]=big(args[i])
       end
       if args[i] isa Expr
          makeIntBig!(args[i])
       end
   end
end
 
julia> ex2=copy(ex1)
:((((((1 / 11) * 1000 ^ 11 + (1 / 2) * 1000 ^ 10 + (5 / 6) * 1000 ^ 9) - 1000 ^ 7) + 1000 ^ 5) - (1 / 2) * 1000 ^ 3) + (5 / 66) * 1000)
 
julia> makeIntBig!(ex2)
 
julia> eval(ex2)
9.14099242414242434242419242425000000000000000000000000000000000000000000000014e+31

We see an improvement here, but the results are not very satisfactory. The divisions yield BigFloat results, which had a tiny bit of floating point errors. Can we do better?
Julia has support for Rational expressions baked in. We can use that improve the results. We just need to search for call expressions the / symbol and replace it by the // symbol. For safety we just have to makes sure the operands are as subtype of Integer.

function makeIntBig!(ex::Expr)
    args=ex.args
    if ex.head == :call && args[1]==:/ && 
              length(args)==3 && 
              all(x->typeof(x) <: Integer,args[2:end]) 
        args[1]=://
        args[2]=big(args[2])
        args[3]=big(args[3])
    else 
        for i in eachindex(args)
           if args[i] isa Int64
              args[i]=big(args[i])
           end
           if args[i] isa Expr
              makeIntBig!(args[i])
           end
        end
    end
end
 
julia> ex2=copy(ex1);
 
julia> makeIntBig!(ex2)
 
julia> eval(ex2)
91409924241424243424241924242500//1

Now that is much better! We have not lost any precision and we ended us with a Rational expression.

Finally, we can build a macro so the if we run into such expressions in the future and we want to evaluate them, we could just conveniently call it.

macro eval_bigInt(ex)
   makeIntBig!(ex)
   quote # Removing the denominator if it is redundant 
      local x=$ex
      (x isa Rational && x.den==1) ? x.num : x 
   end
end

and we can now simply evaluate our original expression as

julia> @eval_bigInt (1/11) * 1000^11 + (1/2) * 1000^10 + (5/6) * 1000^9 - 1000^7 + 1000^5-(1/2) * 1000^3 + (5/66) * 1000
91409924241424243424241924242500

07/28/18

Exploring left truncatable primes

Recently I came across a fascinating Numberphile video on truncatable primes

I immediately thought it would be cool to whip a quick Julia code to get the full enumeration of all left truncatable primes, count the number of branches and also get the largest left truncatable prime.

using Primes
 
function get_left_primes(s::String)
    p_arr=Array{String,1}()
    for i=1:9
        number_s="$i$s"
        if isprime(parse(BigInt, number_s))
            push!(p_arr,number_s)
        end
    end
    p_arr
end
 
function get_all_left_primes(l)
    r_l= Array{String,1}()
    n_end_points=0
    for i in l
        new_l=get_left_primes(i)
        isempty(new_l) && (n_end_points+=1)
        append!(r_l,new_l)
        next_new_l,new_n=get_all_left_primes(new_l)
        n_end_points+=new_n # counting the chains
        append!(r_l,next_new_l)
    end
    r_l, n
end

The first function just prepends a number (expressed in String for convenience) and checks for it possible primes that can emerge from a single digit prepending. For example:

julia> get_left_primes("17")
2-element Array{String,1}:
 "317"
 "617"

The second function, just makes extensive use of the first to get all left truncatable primes and also count the number of branches.

julia> all_left_primes, n_branches=get_all_left_primes([""])
(String["2", "3", "5", "7", "13", "23", "43", "53", "73", "83""6435616333396997", "6633396997", "76633396997", "963396997", "16396997", "96396997", "616396997", "916396997", "396396997", "4396396997"], 1442)
 
julia> n_branches
1442
 
julia> all_left_primes
4260-element Array{String,1}:
 "2"
 "3"
 "5"
 "7"
 "13"
 "23""96396997"
 "616396997"
 "916396997"
 "396396997"
 "4396396997"

So we the full list of possible left truncatable primes with a length 4260. Also the total number of branches came to 1442.

We now get the largest left truncatable primes with the following one liner:

julia> largest_left_prime=length.(all_left_primes)|>indmax|> x->all_left_primes[x]
"357686312646216567629137"

After this fun exploration, I found an implementation in Julia for just getting the largest left truncatable prime for any base in Rosseta Code.

04/1/18

Iterating with Dates and Time in Julia

Julia has good documentation on dealing with Dates and Time, however that is often in the context constructing and Date and Time objects. In this post, I am focus on the ability to iterate over Dates and Times. This is very useful in countless application.

We start of by capturing this moment and moving ahead into the future

julia> using Dates
julia> this_moment=now()
2018-04-01T23:13:33.437

In one hour that will be

julia> this_moment+Dates.Hour(1)
2018-04-02T00:13:33.437

Notice that Julia was clever enough properly interpret that we will be on the in another day after exactly one hour. Thanks to it multiple dispatch of the DateTime type to be able to do TimeType period arithmatic.

You can then write a nice for loop that does something every four hours for the next two days.

julia> for t=this_moment:Dates.Hour(4):this_moment+Dates.Day(2)
    println(t)
    #or somethings special with that time
end
2018-04-01T23:13:33.437
2018-04-02T03:13:33.437
2018-04-02T07:13:33.437
2018-04-02T11:13:33.437
2018-04-02T15:13:33.437
2018-04-02T19:13:33.437
2018-04-02T23:13:33.437
2018-04-03T03:13:33.437
2018-04-03T07:13:33.437
2018-04-03T11:13:33.437
2018-04-03T15:13:33.437
2018-04-03T19:13:33.437
2018-04-03T23:13:33.437

Often we are not so interested in the full dates. For example if we are reading a video file and we want to get a frame every 5 seconds while using VideoIO.jl. We can deal here with the simpler Time type.

julia> video_start=Dates.Time(0,5,20)
00:05:20


Here we are interested in starting 5 minutes and 20 seconds into the video.
Now we can make a nice loop from the start to finish

for t=video_start:Dates.Second(5):video_start+Dates.Hour(2)
    h=Dates.Hour(t).value
    m=Dates.Minute(t).value
    s=Dates.Second(t).value
    ms=Dates.Millisecond(t).value
    # Do something interesting with ffmpeg seek on the video
end

02/9/18

When Julia is faster than C, digging deeper

In my earlier post I showed an example where Julia is significantly faster than c. I got this insightful response

So I decided to dig deeper. Basically the standard c rand() is not that good. So instead I searched for the fastest Mersenne Twister there is. I downloaded the latest code and compiled it in the fastest way for my architecture.

/* eurler2.c */
#include <stdio.h>      /* printf, NULL */
#include <stdlib.h>     /* srand, rand */
#include "SFMT.h"       /* fast Mersenne Twister */
 
sfmt_t sfmt;
 
double r2()
{
    return sfmt_genrand_res53(&sfmt);
 
}
 
double euler(long int n)
{
    long int m=0;
    long int i;
    for(i=0; i<n; i++){
        double the_sum=0;
        while(1) {
            m++;
            the_sum+=r2();
            if(the_sum>1.0) break;
        }
    }
    return (double)m/(double)n;
}
 
 
int main ()
{
  sfmt_init_gen_rand(&sfmt,123456);
  printf ("Euler : %2.5f\n", euler(1000000000));
 
  return 0;
}

I had to compile with a whole bunch of flags which I induced from SFMT‘s Makefile to get faster performance.

gcc -O3 -finline-functions -fomit-frame-pointer -DNDEBUG -fno-strict-aliasing --param max-inline-insns-single=1800  -Wall -std=c99 -msse2 -DHAVE_SSE2 -DSFMT_MEXP=1279 -ISFMT-src-1.5.1 -o eulerfast SFMT.c euler2.c

And after all that trouble we got the performance down to 18 seconds. Still slower that Julia‘s 16 seconds.

$ time ./eulerfast 
Euler : 2.71824
 
real    0m18.075s
user    0m18.085s
sys 0m0.001s

Probably, we could do a bit better with more tweaks, and probably exceed Julia‘s performance with some effort. But at that point, I got tired of pushing this further. The thing I love about Julia is how well it is engineered and hassle free. It is quite phenomenal the performance you get out of it, with so little effort. And for basic technical computing things, like random number generation, you don’t have to dig hard for a better library. The “batteries included” choices in the Julia‘s standard library are pretty good.

02/8/18

When Julia is faster than C

On e-day, I came across this cool tweet from Fermat’s library

So I spend a few minutes coding this into Julia

function euler(n)
    m=0
    for i=1:n
        the_sum=0.0
        while true
            m+=1
            the_sum+=rand()
            (the_sum>1.0) && break;
        end
    end
    m/n
end

Timing this on my machine, I got

julia> @time euler(1000000000)
 15.959913 seconds (5 allocations: 176 bytes)
2.718219862

Gave a little under 16 seconds.

Tried a c implementation

#include <stdio.h>      /* printf, NULL */
#include <stdlib.h>     /* srand, rand */
#include <time.h>       /* time */
 
double r2()
{
    return (double)rand() / (double)((unsigned)RAND_MAX + 1);
}
 
double euler(long int n)
{
    long int m=0;
    long int i;
    for(i=0; i<n; i++){
        double the_sum=0;
        while(1) {
            m++;
            the_sum+=r2();
            if(the_sum>1.0) break;
        }
    }
    return (double)m/(double)n;
}
 
 
int main ()
{
  printf ("Euler : %2.5f\n", euler(1000000000));
 
  return 0;
}

and compiling with either gcc

gcc  -Ofast euler.c

or clang

clang  -Ofast euler.c

gave a timing twice as long

$ time ./a.out 
Euler : 2.71829
 
real    0m36.213s
user    0m36.238s
sys 0m0.004s

For the curios, I am using this version of Julia

julia> versioninfo()
Julia Version 0.6.3-pre.0
Commit 93168a6 (2017-12-18 07:11 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, haswell)

Now one should not put too much emphasis on such micro benchmarks. However, I found this a very curious examples when a high level language like Julia could be twice as fast a c. The Julia language authors must be doing some amazing mojo.

01/27/18

On the 7th anniversary of…a noble dream

Give me a reason to hang on
Give the strength to continue
In this land full of pain

How can I continue
to walk among the lifeless…the cowards and the ill-bred
while knowing that the noble ones are dying…or dead

In this reign of fear, you bend…or disappear
Ozymandias now has pathetic pretender of silly sneer and cold command
His colossal wrecks stretch everywhere, further spreading despair

The servile and hungry rise in thunderous applause to Works of their new god
The smell of decay stifling, but the fouls reckon it ambrosia
Cowardice begetting dystopia

A reckoning is overdue….

01/11/18

Factorizations to get a very special number

Mathematical recreation is fun. I came across a nice Mind your Decisions video and thought it would be cool to use Julia to find that special number. Have a look at the video here before we continue

  1. using Primes
  2.  
  3. function getBigestFactors{T}(a::T,b::T)
  4.     c=T()
  5.     for k in keys(a)
  6.         c[k]=max(a[k],b[k]) 
  7.     end
  8.     for k in keys(b)
  9.         c[k]=max(a[k],b[k]) 
  10.     end
  11.     c
  12. end
  13.  
  14. superfactors=reduce(getBigestFactors,map(factor,BigInt.(1:100))) 
  15.  
  16. n=prodfactors(superfactors)

Julia can be amazingly expressive. In line 14 are getting all the prime factors of all the numbers from 1 to 100 using the map function. We are using the type BigInt since will be generating pretty large numbers. The standard Int will simply overflow. We then use reduce to get all the common factors for the numbers from 1 to 100.

If your are slight confused lets take a bit slower. First we generate a one dimensional array of BigInts from 1 to 100.

BigInt.(1:100)

This makes use of Julia‘s very powerful dot factorization syntax. We then map that array into another array of Primes.Factorization{BigInt} through the factor function in the Primes package. The Primes.Factorization type is subtype of Associative. I learned that from reading the implementation.

The cool thing about Associative types that the can be very conveniently accessed. Lets have a closer look.

a=factor(24)
println(a)

yields

Primes.Factorization(2=>3,3=>1)

That is \(2^3 3\)

You can then do the following

print(a[2])

yielding

3

This means that the number 24 has a 2 raised to the third power as a factor.

Or, be more adventurous and try

print(a[5])

which will print a

0

This makes perfect sense as \(5^0=1\) and 1 is always a factor of any number. We will always get 0 exponent for any number that is not a prime factor of a given number. The Primes.jl package authors have done great use of Julia‘s very powerful type system.

To solve this problem we need to look at all the prime factors for all the numbers from 1 to 100 find the highest exponent of the of each of those prime factors. To do that we implement the getBigestFactor function when does that for any two prime factorizations. We plug that into reduce and et voilà!

The superfactors in line 14 are

2^63^45^27^2111317192329313741434753596167717379838997

Finally we multiply them to get our special number at line 16

n=69720375229712477164533808935312303556800

Are we done yet?

As the saying goes “there is more than one way to skin a cat”. Some are more elegant than others.
Here is a more elegant and computationally more efficient way.

  1. Using Primes
  2. function getAllFactors{T}(n::T)
  3.     p_factors=filter(isprime,T.(1:n))
  4.     exponents=[floor(T,log(p,n)) for p in p_factors]
  5.     zip(p_factors,exponents) |> Dict |> Primes.Factorization
  6. end

Here we just get the prime numbers in the range from 1 to n (line 3). Using those primes we then get the highest exponents that will not yield a number outside the range (line 4). Finally, we package everything as a Primes.Factorization (line 5).

In Julia REPL we get

julia> getAllFactors(big"100")
2^63^45^27^2111317192329313741434753596167717379838997
 
julia> prodfactors(ans)
69720375229712477164533808935312303556800

At that point, you might ask, why stop at a 100. Julia is lightening fast and powerful. Just for fun we can do the same for numbers from 1 to 1000.

julia> getAllFactors(big"1000")
2^93^65^47^311^213^217^219^223^229^231^23741434753596167717379838997101103107109113127131137139149151157163167173179181191193197199211223227229233239241251257263269271277281283293307311313317331337347349353359367373379383389397401409419421431433439443449457461463467479487491499503509521523541547557563569571577587593599601607613617619631641643647653659661673677683691701709719727733739743751757761769773787797809811821823827829839853857859863877881883887907911919929937941947953967971977983991997
 
julia> prodfactors(ans)
7128865274665093053166384155714272920668358861885893040452001991154324087581111499476444151913871586911717817019575256512980264067621009251465871004305131072686268143200196609974862745937188343705015434452523739745298963145674982128236956232823794011068809262317708861979540791247754558049326475737829923352751796735248042463638051137034331214781746850878453485678021888075373249921995672056932029099390891687487672697950931603520000

Now we get super special number that is 443 digits long. Lets finish by a Julia one liner to get the number of digits

julia> getAllFactors(big"1000") |> prodfactors|> digits |>length
433
01/2/18

Visualizing the Inscribed Circle and Square Puzzle

Recently, I watched a cool mind your decsions video on an inscribed circle and rectangle puzzle. In the video they showed a diagram that was not scale. I wanted to get a sense of how these differently shaped areas will match.

There was a cool ratio between the outer and inner circle radii that is expressed as

\(\frac{R}{r}=\sqrt{\frac{\pi-2}{4-\pi}} \).

I used Compose.jl to rapidly do that.

using Compose
set_default_graphic_size(20cm, 20cm)
ϕ=sqrt((pi -2)/(4-pi))
R=10
r=R/ϕ
ctx=context(units=UnitBox(-10, -10, 20, 20))
composition = compose(ctx,
    (ctx, rectangle(-r/√2,-r/√2,r*2,r*2),fill("white")),
    (ctx,circle(0,0,r),fill("blue")),
    (ctx,circle(0,0,R),fill("white")),
    (ctx,rectangle(-10,-10,20,20),fill("red")))
composition |> SVG("inscribed.svg")