Paul Berry (stereotype441) wrote,

ICFP: the big post.

Ok, here's the long version of my previous post about the ICFP programming contest. I put in as much information as I could remember and/or reconstruct. Everyone else who was involved or hung out with us in the process, I'd love it if you would fill in the gaps in my memory.

ICFP, if you don't know, stands for International Conference on Functional Programming. Functional programming is my favorite kind of programming.

I'll try to tell the story in layman's terms as much as possible. Warning: spoilers.

Before the problem was even unveiled, we had set up a Pibb channel to act as a sort of digital "whiteboard", and jes5199 had rented a computer from Amazon.com's Elastic Compute Cloud. It wasn't any faster than, say, my laptop, but for most of the contest we referred to it as the "supercomputer".

Friday

The problem was unveiled at 3am Friday morning (a reasonable time if you live in Utrecht). jes5199 pulled an all-nighter so he could read the problem as soon as it came out. boojum and j3h woke up early for the same reason. lindseykuper decided at the last minute to join us, and pulled an all-nighter. I got a good night's sleep and didn't wake up until 7. I started reading the problem statement about 1 minute after waking up.

It said that an alien life form, called Endo the Fuun, had crash landed on Earth, and needed his DNA modified in order to be able to survive in our environment. They gave us an algorithm we could use to simulate how Endo's DNA gets converted to RNA, and another algorithm we could use to simulate how Endo's RNA would get translated into Endo itself, by translating it into a 600x600 image. And, of course, they gave us a copy of Endo's DNA sequence, which was 7.5 million bases long. Unlike Earth DNA sequences, which are represented by the letters A, C, G, and T, Endo's DNA is sequence represented by the letters I, C, F, and P. Cute.

In case it isn't obvious, Endo's DNA was a computer program, and the two algorithms were programming languages, one for manipulating DNA/RNA sequences, and one for translating the RNA result into a picture. For handy reference, they showed us a picture of what Endo's DNA produces after you execute it.



Our task was to create a "prefix" DNA sequence, which if you attached it to the beginning of Endo's DNA, would produce this picture:



The problem statement even spelled out exactly how our entries would be judged: a computer program written by the contest organizers would run our DNA sequence, convert it to an image, and analyze the image to produce a "survival probability". It would mostly be based on counting how many pixels were correct, with a slight penalty if your DNA sequence was overly long.

I met the others at Fuel Cafe about 8:30 or 9. lindseykuper hung out with us and read the problem too, and was a good source of ideas, although I don't think she did any coding.

While I combed through the problem statement, j3h and boojum worked on a simulation of Endo's DNA-to-RNA translation process (in Haskell), and jes3199 wrote a Ruby program to translate the RNA into a picture. We quickly reached the point where we needed source control, so we set up a darcs repository on the supercomputer. After I finished reading the problem I started looking at their code.

At around 11:30 we moved to my house and holed up in the basement. The DNA-to-RNA program was slow going. The RNA-to-picture program was mostly done, but it was extremely slow, and we couldn't see how to speed it up without rewriting it in a different language. So jes5199 and I started rewriting it in C++.

By about 2, the DNA-to-RNA program was partly working, but it seemed very buggy--it just didn't seem to produce correct looking results. I switched over to look at that with j3h while boojum and jes5199 finished the RNA-to-picture code. I think lindseykuper skipped out at this point.

A lot of coding and debugging followed. jes3199 caught some sleep. keystricken brought us tea (herbal tea for me, bubble tea from them). In the evening we headed to Farpoint.

At around 8 or 9 pm, we had "finished" coding both the DNA-to-RNA and RNA-to-picture programs, and fixed most of the bugs. We were anxious to process DNA to RNA so that we could run it and start seeing pictures. Only problem: the DNA-to-RNA program was impossibly slow. So slow that there was no way it would finish executing before the contest was finished. And we hadn't yet begun to think about to write the DNA-modifying program that we would eventually have to submit.

A happy side effect of the slowness of the code was that we found a hint we would have otherwise missed. The first thing the Endo DNA does when you execute it is output drawing commands that produce this image:



And then it immediately erases it.

Clearly this was a sequence we were supposed to try applying to Endo's DNA. We tried applying it, but our code was so slow that we couldn't see its effect.

j3h thought of some Haskell library code that he thought would speed things up, and I agreed, so we set to work at it. It was a lot harder than I expected. By about midnight I thought I had finished converting the code over to the library, but it was no faster. I started feeling very despondent, and went around telling everyone else (who were all trying to doze off) that Haskell was just not as good a programming language as I thought, and that I felt like starting over in C++ for this part of the problem too.

Finally, anonamyst convinced me to go on a walk with her for about 15 minutes. When we returned, I looked back at the code and found several pieces of conversion I hadn't done yet. By about 1 am, I had finished the library conversion. The result still wasn't fast, but it was at least usable. It took about an hour to turn the DNA source into RNA. Unfortunately, when we tried to convert the RNA into an image it didn't work at all.

Sometime around 2 am I curled up to sleep on the couch. jes5199 had been asleep for quite some time, which was understandable because of his all-nighter. j3h and boojum were awake, but I didn't know what they were doing. Twice while I was drifting off, I thought of an idea, stumbled across the room to tell it to them, and then lay down on the couch again, without really completely waking up. They told me later that they were both good ideas, which makes me both astonished and proud.

Saturday

About 6am I woke up again, with Tina (one of the cats) asleep on top of my feet. They were extremely warm.

The others got up soon afterward, and j3h and boojum briefed me and jes5199 on what they had been doing the previous night. They had written code that searched through Endo's DNA for sequences of drawing commands. They found 764 "chunks" of drawing commands and executed them to see what they did. The longest of these chunks was particularly interesting. It generated this image:



At some point someone reread the problem statement and noticed this promising hint: "we noticed that something curious happens if the following prefix is used: IIPIFFCPICICIICPIICIPPPICIIC". We ran that sequence and got this image:



Clearly we had some bugs, and we suspected they were in our RNA-to-image code. They were. I found one bug, which caused the image to get more and more transparent as the drawing process went on, and boojum found the other, which turned the image on its side.

Now our images looked like this:



And



At some point we remembered the DNA sequence that we had first tried prepending to Endo's DNA (the sequence that Endo draws and then erases when you start executing its DNA), and tried running it again now that our code was fast. It produced the "field repair guide" image.

We headed for breakfast somewhere on Mississippi. We felt great: the contest was less than half over, and we had already managed to render the source image and find two hint images.

We had a lot of great ideas about where to go next, but I can't remember what they were. I should go find our notes from that breakfast (which are still at Farpoint) and transcribe them for for posterity.

Here's what we actually wound up doing, as far as I can remember. I don't really remember who did what.

We tried the sequences we found in the field repair guide. Some of them led us to still more images, which contained other sequences and text. The field repair guide turned out to be a manual with a dozen or so pages. Some of them were beautiful. Some contained data in code. Some of them were silly. Here's one of my favorites:



At some point I started executing some of Endo's DNA by hand, trying to understand how it worked. I found a pattern that seemed to be a kind of function call. Someone else (j3h, maybe?) noticed that occasionally Endo would output a piece of junk RNA (which would be discarded by the RNA-to-drawing transformation), and that these pieces of junk RNA seemed to mark the beginnings of functions. Someone produced a list of all the functions and how frequently they executed when running Endo's DNA sequence.

jes5199 started doing static analysis of Endo's DNA, and eventually produced a master list of all of Endo's functions, and what functions they called. (After the contest was over, I realized that this list was incomplete, for reasons that we wouldn't discover for quite some time).

I continued hand executing Endo's DNA (doing dynamic analysis, if you will), and started picking up some of the clever techniques the contest authors had come up with to use Endo's DNA-to-RNA rules as a programming language.

I don't remember what everyone else was doing. But several times during the day, someone would tell me and jes5199 that we should compare notes, because we were finding similar things. For a long time, we didn't really compare notes, and that was a mistake. I think part of why we didn't compare notes was that I kept picking up signals from jes5199 that he was not quite ready to show me his static analysis. In retrospect, I'd misread the situation: the reason he wasn't showing me his static analysis was that he'd already checked in code that could do the static analysis, and it would be quite easy to run it myself. Unfortunately, because I didn't realize this, I didn't wind up seeing the static analysis for a long time.

We had a late lunch/dinner (at Thai Noon I think) and headed back to Farpoint.

I began to become despondent again. Although we had made a lot of progress in understanding how Endo's DNA worked, it was clear that we had a long long way to go before we could write a program of any significant length in Endo's DNA language, and it seemed like we were going to have to write a pretty huge program in order to paint all the shapes that were different between the two images.

Then, shortly before 5pm, we had a breakthrough. Someone noticed that we had forgotten to follow up on one of the prefix sequences in the field repair guide. It was 28 letters long. We tried it and discovered that it lightened up the image, so that rather than having the nighttime color scheme of the original image, it had the daytime color scheme of the image we had to produce. This was a breakthrough for two reasons: (1) it took us from having no pixels correct to having lots of pixels correct, namely all the pixels in the background of the image. (2) It gave us the hint that maybe we wouldn't have to write a huge program to draw the target picture--maybe all the elements of the target picture were already programmed into Endo, and we just had to figure out how to activate them.

We submitted the 28 letter prefix as our first entry, at 5:04pm. In about 10 seconds, the result came back. Survival probability: 1.2719%. Looking at the scoreboard, we saw that dozens of teams had the exact same score as us, and the exact same prefix length. Clearly we had all found the field repair guide, found that sequence, and tried it. So we weren't winning. But we were in the standings.

In the evening pmb dropped by (he was visiting from Eugene to do some juggling at Geek Fair), and some of us went for a walk with him and talked about the problem. I remember at this point I was beginning to speculate about Endo's ABI, things like: how does a function call work? How does one function pass parameters to another function? How do functions find each other in the DNA? I told pmb a lot of my theories on the walk. Most of them turned out later to be wrong in the details, but on the right track.

We continued into the night doing further optimization, finding more hidden images, and following up on hints. anonamyst baked us some seitan pot pie. Eventually I looked at jes5199's static analysis, showing where in Endo's DNA all the functions were. At around 4am, boojum had figured out how to work the "adaptor activator", which was a convenient way to call one of Endo's functions before the rest of his DNA executed. I figured out a way to modify the adaptor activator so that as soon as the function call returned, it would nuke the rest of Endo's DNA, so that we could see the graphic that was produced by just calling that function.

I think it was about 4:30am at this point. Everyone else had gone to sleep. I realized that my modification to the adaptor activator was a major breakthrough. Now I could execute any one of Endo's functions in isolation and see what it did. Some of them took a long time to run so I didn't investigate them. Others didn't work because they needed parameters, and I didn't know what parameters to give them (or even how to give them parameters). But others just worked. I found the function for drawing the apple tree, and the function for drawing a single apple. I found the function for drawing Endo's left arm, his spaceship, and the cargo box. And this was the clincher: I found a function to draw Endo's cow mouth, a shape that only existed in the target image we were supposed to produce.

Suddenly the problem seemed easy: all we had to do was figure out what each of Endo's function's did, and patch them. We'd find the function that drew the original alien Endo, and the function that drew the final cow-shaped Endo, and we would just have to replace a call to the first function with a call to the second. It was going to work.

Sunday

The sun had come up at this point. At around 7am or so j3h was awakened by Ghost begging for food (Ghost was a stray cat that had wandered into the house a few days before, on the brink of starvation. Judging by his mewing, he seemed to be in a perpetual state of desperate hunger). j3h was astonished to see me still up. He was planning to feed Ghost and go back to bed, but as soon as he saw my progress he was wide awake.

j3h remembered one of the subtle differences between the source and target images: we were supposed to change the apple tree into a pear tree. I had already found the apple function and the apple tree function, and with his help, we found the pear function. Then we carefully crafted a DNA sequence that would patch Endo's DNA, replacing every apple with a pear.

Since our DNA-to-RNA translation was still so slow, we were not able to simulate the whole effect of the prefix. But we were able to simulate its effect on the apple tree in isolation. And sure enough, it transformed it into a pear tree. At 10:08 am we submitted the patch. In about ten seconds the result came back. Survival probability? 1.3131%. Our rank in the standings? It wouldn't tell us our exact rank, because we were in the top 20.

j3h and I began to do some soul searching. You see, all four of us were scheduled to go to an anniversary party Sunday afternoon and evening. The submission deadline for the contest was 3am Monday morning. Before the contest had begun, we had all agreed that we would just pretend the contest was over at noon Sunday and enjoy ourselves at the party. The only thing that could possibly keep us from the party was if we were winning.

Now, it was just after 10 am, two hours before our self-imposed stop time, 17 hours before the contest would officially end, and for all we knew, we might indeed be winning.

We had to keep going.

jes5199 and boojum got up soon afterward. We found the spot in Endo's DNA that was used to encode the first page of the field repair guide, and decided to splice our own code into it. I spent some of my time working on Haskell code so that we could generate patching sequences faster--that was very fruitful. I spent some time searching at random through the function list in jes5199's static analysis, trying to find more functions we might patch--that was slightly fruitful. I spent a lot of time trying to figure out how to temporarily turn off some of the most time-consuming parts of the image, so that we could test our changes faster--I completely failed in that endeavor.

Some of the others spent some time working on splicing into Endo's DNA a no-op function (a function that did nothing but return). The idea was that if we wanted to make something in the image disappear (like, for example, the λxx to the left of the windmill, or the spaceship), we could replace the call to whatever function drew that thing with a call to the no-op function. But they couldn't get the no-op function to work.

They also spent some time writing a program that would interactively execute Endo's DNA one step at a time, allowing you to examine what the DNA looked like at every stage of execution. In other words, it was a debugger for Endo DNA. That was a stroke of genius. It basically automated the dynamic analysis process I had begun doing the previous day.

By around 2:30pm, we had not come up with anything better to submit, and we had fallen in the standings. I can't remember where we fell to--around 40th, maybe? j3h, boojum, and I left for the party. jes5199 followed about 20 minutes later.

I still had not slept, and was in no mood for sleeping. I hung out with people in the park and juggled with conform for a while. I rode on the merry-go-round. I had a soda. People kept trying to engage me in conversation, but it didn't go so well--I kept staring off into the distance, half thinking, half dazed from lack of sleep.

j3h, jes5199, and boojum kept talking about the problem. I tried not to think about it, figuring that I needed a significant break. By around 4:30 or 5, they had come up with some more ideas and headed back to Farpoint. I stayed at the party.

I can't remember when dinner was served. 6pm maybe? While I was going back for seconds, the others showed back up. They were ebullient. boojum was jumping vigorously into the air. They still hadn't been able to write no-op, and they still needed to write it. But in some places, they were able to make up for not having a no-op by using the "pear" function. Instead of getting rid of the spaceship and the λxx, they changed them into pears. Since pears are so small, it was almost as good in terms of the number of correct pixels. They submitted that result and got a 2.0028% survival probability, which put us into the top 15.

Also, they realized that in order to find parts of the image we could replace, they needed to try running every one of Endo's functions to see what its output looked like. And since our DNA-to-RNA translator was still so slow, they needed to set up as many machines as possible to do it during the party. So jes5199 rented about 20 more Amazon "supercomputers" and set them all to running Endo's functions.

I finished dinner, did a little tango dancing, and listened to some lovely stories. At about 8pm, j3h, boojum, and I ditched the party and returned to the contest. jes5199 decided to stay for a while longer. At this point I had been up for 38 hours. We got on our bikes and rode back to Farpoint. During the bike ride I developed a splitting headache that did not go away for the rest of the contest.

The others explained to me that (1) the Endo DNA debugger was much improved, and could be used as a tool for figuring out how to write a no-op function, (2) nonetheless their attempts to write one had failed, and (3) they needed me. I was honored.

We got back to the house and dove right into the no-op problem. By some miracle my brain still seemed to be firing on nearly all cylinders. I knew I needed to understand the ABI in order to fix it, so I started by just brain-dumping everything I had thought about it. The others provided information they had learned while I was at the party. We watched some functions execute in the debugger. We cross-referenced with a page in the field repair guide that seemed to be obscurely trying to tell us what we wanted to know.

Slowly the remaining pieces fell into place. Endo's DNA was simulating a fairly conventional computer architecture, with scratch memory, a code segment, and a call stack. It even used fairly standard calling conventions (Pascal style IIRC, but I don't have my notes with me). Armed with this information, I wrote no-op, and it nearly worked. At some point jes5199 got home and started pouring through the supercomputer output, identifying more parts of the image we could try to change.

It was midnight. I had been up for 42 hours, and had 3 to go. I was beginning to ache all over, and my short term memory was completely gone. Every time I switched to a shell window to type a command, I immediately forgot what I was planning to type and had to spend about 20 seconds trying to remember it. But there was a bug in no-op, and I had to fix it. I found it and fixed it, and I was trying to commit it to the master source control repository when I had what boojum refers to as my "meltdown".

Our source control system, darcs, completely failed me. j3h and I had made conflicting changes to the same file, and I was trying to resolve the conflict. Each time I told darcs to commit the change, it refused to because of the conflict. Then I went in with the text editor to see the conflict, resolved it, saved, and told darcs to submit the patch again. And again, it refused, saying there was a conflict. Apparently I started yelling and pounding on my keyboard--I remember the yelling, but not the pounding. The contest was less than 3 hours from over, I had one of the keys to solving it, and my source control software was preventing me from sharing it with the others.

I managed to fix the source control problem in about 20 minutes and send the fix to the others. Then I calmed down again. It was about 12:30 am and I was losing coherence.

The next big task seemed to be to transform Endo from his alien form to cow form. We started examining the routine for drawing Endo, and discovered that one of the first things it was test a flag in Endo's DNA. If the flag was "P" (meaning true), it executed one block of code; if it was "F" (meaning false), it executed another. Maybe that's how Endo is supposed to turn into a cow, we thought: we just change this flag.

My last meaningful contribution was to help find where in Endo's DNA the flag was located. We found it, flipped the flag, and re-rendered Endo. This happened at about 1:48 am. What resulted was my favorite image of all:



I could feel the smile spreading across my face as I said "those bastards!" (here is a wikipedia ref if you don't get the joke).

A few minutes later, I took a turn for the worse. It was nearly 2am, and I had not slept for 44 hours. I was having incredible lapses in concentration. Even though there was only an hour left in the contest, I realized I would not make it. I closed my laptop and announced I was going to sleep. And I did.

I woke up at about 7 the next morning with Tina curled up next to me. j3h had posted our best entry in his Livejournal. Our entry had a survival probability of 3.0861%. It looked like this:



Apparently they turned Endo into a duck on purpose. None of us know where the duck in the trees came from.

Submissions we made

The last column is Utrecht local time--subtract 9 hours to get to pacific daylight time. So, for example, our last submission was made at 2:55am Monday the 23rd.

Risk Survival Chance Prefix length Message Submit time
3587389 0.0000% 3999 OK 2007-07-23 11:55:19.113691
1036119 3.0861% 3549 OK 2007-07-23 11:43:35.658635
3594624 0.0000% 3994 OK 2007-07-23 11:33:30.875791
1036316 3.0820% 3746 OK 2007-07-23 11:22:59.168592
1055469 2.7068% 2869 OK 2007-07-23 10:58:52.294476
1064431 2.5452% 3521 OK 2007-07-23 10:47:39.839214
1050716 2.7960% 3296 OK 2007-07-23 10:24:35.973925
Error Error Error Zip corrupted or no Zip. 2007-07-23 10:23:23.427417
3585323 0.0000% 1933 OK 2007-07-23 10:12:15.605605
1098629 2.0028% 1659 OK 2007-07-23 02:47:11.516359
3584940 0.0000% 1550 OK 2007-07-22 22:28:14.430673
1156403 1.3131% 1193 OK 2007-07-22 19:08:55.244399
1160705 1.2714% 75 OK 2007-07-22 08:44:27.694574
1160658 1.2719% 28 OK 2007-07-22 02:04:22.83462
3598660 0.0000% 0 OK 2007-07-19 14:59:38.477502


UPDATE: This entry has been repeatedly targeted by spammers, so I've turned on comment screening. You can still make a comment, but it won't appear until I approve it.
  • Post a new comment

    Error

    default userpic

    Your reply will be screened

  • 19 comments