Summer adventure games

05 Jun 2017

It’s the first weekday of summer vacation, and we don’t quite have childcare worked out yet, so my poor kids are stuck in the office with me this afternoon.

Kayla trying not to crack a smile

Kayla failing to not crack a smile

A few things come to mind. First, thank goodness for the Switch. Kayla’s playing Breath of the Wild quietly. Second, yay, there’s a new Monument Valley out today! That helps a ton.

Mostly though, I think I need to load up that computer in the background with old Sierra games. One of my elementary school summers (I think it was between 4th and 5th grade?) I remember spending a bunch of time at my friend’s dad’s dental office playing King’s Quest 3 and the Colonel’s Bequest on their computer.

I need to find some good adventure games for the kids. Good ones that take hours and hours to play though. :D


Silly benchmarks

04 May 2017

It started with me being curious: if I use an Integer in Kotlin, am I going to be paying a penalty for using a boxed primitive? So I wrote some silly benchmarks to confirm. First, in Java:

public class SumTestPrimitive {
  public static void main(String[] args) {
    long sum = 0;
    for (long i = 1; i <= Integer.MAX_VALUE; i++) {
      sum += i;
    }
    
    System.out.println(sum);
  }
}

public class SumTestBoxed {
  public static void main(String[] args) {
    Long sum = 0L;
    for (Long i = 1L; i <= Integer.MAX_VALUE; i++) {
      sum += i;
    }
    
    System.out.println(sum);
  }
}

On my system (a mid-2011 MacBook Pro, 2.4 GHz i5) the primitive version takes 1.77 seconds, and the boxed version takes 20.8 seconds. Then in Kotlin:

fun main(args: Array<String>) {
	var sum = 0L
	var i = 0L
	while (i <= Integer.MAX_VALUE) {
		sum += i
		i += 1L
	}

	println(sum)
}

The Kotlin one takes 1.81 seconds. Tiny bit slower than the Java primitive one, but that’s probably just due to needing a little more time for Kotlin’s runtime to load. Kotlin does unboxed primitives properly, yay!

Now I’m curious though: how do the other languages I use on the regular perform? Let’s try Clojure first, both a straightforward implementation and one tailored to match the Java one better:

(defn sum-test-straightforward []
  (loop [sum 0
         i 0]
    (if (<= i Integer/MAX_VALUE)
      (recur (+ sum i) (+ i 1))
      sum)))

(defn ^long sum-test-gofast []
  (loop [sum 0
         i 0]
    (if (<= i Integer/MAX_VALUE)
      (recur (unchecked-add sum i) (unchecked-add i 1))
      sum)))

sum-test-straightforward took 5.1 seconds, and sum-test-gofast 1.69 seconds. The gofast one is comparable to the Java one, probably a little slower: I ran these at a REPL, so there’s no startup time involved.

Ok, how about Common Lisp? I can think of 3 approaches to take off the top of my head.

;; 2147483647 is the same value as Java's Integer/MAX_VALUE.

(defun sum-test-iterative ()
  (let ((sum 0))
    (dotimes (i 2147483647)      
      (setf sum (+ sum i)))
    sum))

(defun sum-test-recursive ()
  (labels ((sum-fn (sum i)
             (if (<= i 2147483647)
                 (sum-fn (+ sum i) (+ 1 i))
                 sum)))
    (sum-fn 0 0)))

(defun sum-test-loop ()
  (loop for i from 1 to 2147483647 sum i))

Using ClozureCL, all 3 of these perform abysmally:

SBCL does much better off the bat, but still not great:

Adding some type annotations and optimize flags helped SBCL, but ClozureCL’s times stayed the same:

(defun sum-test-iterative ()
  (declare (optimize speed (safety 0)))
  (let ((sum 0))
    (declare ((signed-byte 64) sum))
    (dotimes (i 2147483647)
      (setf sum (the (signed-byte 64) (+ sum i))))
    sum))

SBCL’s sum-test-iterative drops down to 3.13 seconds, still no allocation. No change on Clozure. I’m probably doing something wrong here, but it’s not clear to me what. The disassembly of sum-test-iterative on SBCL shows that there’s still an allocation going on there: maybe the problem is just that 64-bit integers don’t work unboxed due to SBCL’s pointer tagging?

* (disassemble 'sum-test-iterative)

; disassembly for SUM-TEST-ITERATIVE
; Size: 69 bytes. Origin: #x1002AF97D7
; 7D7:       31C9             XOR ECX, ECX                    ; no-arg-parsing entry point
; 7D9:       31C0             XOR EAX, EAX
; 7DB:       EB2D             JMP L3
; 7DD:       0F1F00           NOP
; 7E0: L0:   488BD1           MOV RDX, RCX
; 7E3:       48D1F9           SAR RCX, 1
; 7E6:       7304             JNB L1
; 7E8:       488B4AF9         MOV RCX, [RDX-7]
; 7EC: L1:   488BD0           MOV RDX, RAX
; 7EF:       48D1FA           SAR RDX, 1
; 7F2:       4801D1           ADD RCX, RDX
; 7F5:       48D1E1           SHL RCX, 1
; 7F8:       710C             JNO L2
; 7FA:       48D1D9           RCR RCX, 1
; 7FD:       41BB00070020     MOV R11D, #x20000700            ; ALLOC-SIGNED-BIGNUM-IN-RCX
; 803:       41FFD3           CALL R11
; 806: L2:   4883C002         ADD RAX, 2
; 80A: L3:   483B057FFFFFFF   CMP RAX, [RIP-129]              ; [#x1002AF9790] = FFFFFFFE
; 811:       7CCD             JL L0
; 813:       488BD1           MOV RDX, RCX
; 816:       488BE5           MOV RSP, RBP
; 819:       F8               CLC
; 81A:       5D               POP RBP
; 81B:       C3               RET
NIL

Next up, Swift:

var sum: UInt64 = 0
var count: UInt64 = 2147483647

for i: UInt64 in 0 ..< count {
    sum = sum + i
}

print(sum)

Without optimizations, 16 minutes 50 seconds. Holy shit.

With optimizations, 1.11 seconds.

Ok, last one, C:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
	long sum = 0;
	const long count = strtol(argv[1], NULL, 10);
	for (long i = 0; i <= count; i++) {
		sum += i;
	}
	printf("%ld", sum);
	return 0;
}

Why take in the count parameter from the command line? Because Clang cheats. If I use the constant in there, it’s smart enough to just precalculate the whole thing and just return the final result.

Without optimizations:

solace:sum-tests jfischer$ clang -o sum-test SumTest.c 
solace:sum-tests jfischer$ time ./sum-test 2147483647
2305843008139952128
real	0m8.247s
user	0m8.190s
sys	0m0.035s

With optimizations:

solace:sum-tests jfischer$ clang -Os -o sum-test SumTest.c
solace:sum-tests jfischer$ time ./sum-test 2147483647
2305843008139952128
real	0m0.006s
user	0m0.002s
sys	0m0.002s

It turns out, Clang still cheats even if the loop counter comes from outside. I’m pretty sure it recognized what I’m doing and just turned that loop into Gauss’ trick for computing an arithmetic series. It doesn’t matter what loop count I give it, it always takes the same amount of time with optimizations.

I can’t read/write assembly, but playing around on godbolt.org makes it look like that’s the case: https://godbolt.org/g/FmL66q. (There’s no loop in the disassembly.) And I can’t figure out how to trick it into not doing that, so I’ll call it quits for now.


Sneaking Clojure in - Part 2

23 Feb 2017

I ended up turning to a Clojure REPL to solve an issue in that project I totally didn’t sneak Clojure into before and realized I did some things the hard way last time.

First up: you don’t need to create and compile a Java class from Clojure to call into Clojure code from Java. If I had actually read the Java Interop reference guide on Clojure.org, I would have noticed that there’s a section on calling Clojure from Java. It’s much, much easier.

If I define this namespace/function:

(ns project.util)

(defn get-my-thing []
  {:my :thing})

I can call it like so:

// In Java code:

// First, find the require function. Then use it to load the project.util namespace
IFn require = Clojure.var("clojure.core", "require");
require.invoke(Clojure.read("project.util"));

// After project.util is loaded, we can look up the function and call it directly.
IFn getMyThing = Clojure.var("project.util", "get-my-thing");
getMyThing.invoke();

Easy peasy. I don’t have to jump through the gen-class hoops, and bonus! I don’t have to compile my Clojure code ahead of time. I just need to make sure the source files are on the class path.

You should of course compile your Clojure code if you’re distributing an application built on it. It’ll load faster, plus you might not want it readable.

What I specifically didn’t want to hook into that project that I totally wasn’t sneaking Clojure into is a REPL: I want to be able to poke directly at the application’s state while it’s running. To do that, I’ll need to make sure that tools.nrepl is available on the classpath, and require/launch it from within the application.

I could probably use Clojure 1.8’s socket server repl instead, but I plan on using Cider to talk to it, so nrepl’s a better choice.

In Java code:

public static void launchNrepl(int port) {
  try {
    IFn require = Clojure.var("clojure.core", "require");
    require.invoke(Clojure.read("clojure.tools.nrepl.server"));

    // Note: passing "::" as the :bind parameter makes this listen on all interfaces.
    // You might not want that.
    IFn startServer = Clojure.var("clojure.tools.nrepl.server", "start-server");
    startServer.invoke(Clojure.read(":bind"), "::", Clojure.read(":port"), port);
  }
  catch (Exception e) {
    // log the error
  }
}

In my theoretical project where I totally didn’t do this I also load in a namespace of helper code I’ve written to wrap around the Java objects we already have written.


Which of Apple's game technologies can I use?

08 Feb 2017

Writing this down here because I didn’t see an easy reference to it anywhere and I just took the time to figure it out.

I’m starting a new game-ish thing and figuring out what I want to build it in. What technologies can I use? In particular, I’m looking at:

Read more...


Sneaking Clojure into an older Java project

18 Jan 2017

Now I’m not saying I did this, but if I was going to do it and took notes for future reference, this is how I’d sneak some Clojure code into an older, Ant based Java project.

Read more...