Author Archive

My blog is dead; long live my blog

December 2, 2011 Leave a comment

It’s been a long time since my last post, so at this point my blog is pretty much dead. I never was much of a blogger anyways, so I’m kind of surprised I posted so much last year. I started a new job this year, and sometimes things get busy and I disappear for a while… but the long and short of it is, my blog is dead 🙂

Looking back over my prior posts, I realized a few things that might be useful for other aspiring bloggers out there:

  1. Think twice, three times before posting a blog entry to reddit/digg/etc. If you have a catchy title, you can pull down a few thousand hits per day easily, but you should really think “Is my post worth sharing? Or does it need some work before a few hundred/thousand people wander over?”
  2. Figure out “why am I blogging?” and stick to that reason. If you’re blogging because you’re bored, your posts will probably be boring. If you’re blogging because you’re genuinely curious about something, and you put in the time to make interesting posts… then you’ve got a great reason to blog.
  3. Blogging takes time. Especially for highly technical posts, or posts about a project/experiment/some code you wrote. You should expect to spend several hours writing, reviewing, and proofreading each post.
  4. Blogging to get hits is a crappy excuse for blogging. Blog because you’re passionate about something, heck with hits and link karma.

I don’t have much else to say right now, but while I’m typing, here are a couple of links for fun/profit:

  1. Bit hacks – best collection of bit hacks I’ve run across… need to know how to hack bits optimally? Read on…
  2. Jenkins CI – life without some kind of continuous integration is a pain. Everyone should always test code before hitting the ‘submit’ button, but reality != fiction, and CI makes your life easier. Check it out, Jenkins is pretty cool.


Categories: Uncategorized

Core Dumps Rock

April 17, 2011 3 comments

So it’s been a while since my last post, and I find myself thinking about core dumps lately.

I won’t be wowing you with my awesome knowledge or teaching obscure implementation details regarding core dumps… I guess my bottom line message is “core dumps rock.”

Some of the more hard-core guys like Linus don’t believe in debuggers (real men don’t need ’em, eh?) but for the rest of us mere mortals debuggers are vital tools. And while I agree with Linus’s reasoning against debuggers, especially in the kernel — I guess I’m a pragmatist at heart. Use whatever tool helps you get the job done…

Anyways, core dumps are awesome. Especially if any of the following are true:
– you’re chasing an intermittent bug that only happens once in N runs or doesn’t reproduce with a debugger attached
– your code is running on an embedded device and there’s no JTAG (hence no realtime debugging/debugger)
– you want to save a snapshot of the system/app/problem for debugging later

Core dumps (and what triggers them) vary from OS to OS, but they generally contain some/all of the following:
– the page tables/memory for the crashing process, or a full RAM dump on some embedded targets
– the processor registers, including MMU configuration and other hardware configuration/state info

This info can be loaded into a debugger later on to reconstruct a backtrace and examine variables, and it goes a long way toward solving the crash at hand.

I’ve used core dumps several times recently and they’ve been a real lifesaver. The exact mechanics of using core dumps are already documented ad nauseam elsewhere, so I won’t repeat that info here. Just wanted to put in a plug for core dumps… Because when you need ’em, they… Rock.


Categories: Uncategorized Tags: ,

If Today Was Your Last Day… To Code

February 6, 2011 3 comments

This post comes courtesy of Nickelback, one of my “top 20” bands. I’m listening to their song “If today was your last day” and thinking happy thoughts, so walk with me for a bit. Pretend it’s your last day to code, what would you do?

If today was your last day [to code], and tomorrow was too late…

Would you spend time in meetings discussing the next cool project? Sit through yet another status update meeting while a coworker reads bullets from a PowerPoint?

Would you write new wiki pages / document all the arcane parts of the system that nobody else understands? Would you finally finish the interface documentation you keep putting off? Would you clean up your desk and blurt out epithets at random passers-by?

Would you spend an extra hour teaching a coworker or junior engineer how to write cleaner/better code? Would you spend your last day tutoring, mentoring, making sure those you leave behind get the last ounce of wisdom before you go?

Would you start a new project, cut your teeth on the euphoria of starting something new and creating something from the ground up? Or would you rush around trying to fight the last few bugs blocking your product from shipping next week, in the hopes that your efforts will make the product better for the customer?

Would you seek out the hardest, gnarliest, most beasty multi-threaded race condition and kick it to the curb? Would you grab the nearest flyswatter and quash a few bugs just for old times sake? Or would you coast through, take an extra long lunch, and try to enjoy the last few hours of your coding existence?

Would you rewrite that one butt ugly section of code you’ve been ignoring? You know, the one that makes your skin crawl and makes your eyes bleed. Would you rip it to pieces and give it another go?

Would you explore new tools, migrate to git, drop MySQL for Postgres, check out the new code profiler, or finally use valgrind to kill memory leaks? Would you improve the validation for your project, check out CruiseControl or Hudson/Jenkins?

Would you write in C, assembly, Python, Perl, Ruby, Haskell, C#, Java, Erlang, Go, or Lisp, or would you invent your own language just for fun?

Would you write a new compiler, OS, GUI, database, web browser, or iOS/Android app? Would you migrate up the stack to functional programming or get into research? Or would you learn more about hardware, get closer to the processor and learn more about transistors?

Summarizing the song… Live today like it’s your last day, and do what you really wanted to do all along.

“It’s never too late to shoot for the stars regardless of whoever you are. So do whatever it takes…”



Categories: Life Tags: , ,

Running Towards Something

January 21, 2011 Leave a comment

Warning: this post contains no code, no technological jargon, nothing interesting to the average geek.

Instead, I want to tell a semi-personal story which I hope will inspire others to greater heights. Failing in that, I’m sitting on a train and have nothing better to do… so here we are.

I had an interesting conversation with an acquaintance a few months back; he was talking about the importance of running towards something. His point was basically that too many people spend their careers (lives?) running away from the things they don’t want to do, when instead they should be running towards the things they want.

I know, he’s a flippin’ genius.

But in general I think he’s spot on… You tend to get what you focus on in life, so why not put your focus into figuring out what you actually want to do, and go do it? Instead of running away from bad decisions… Make good ones.

I know I’m belaboring the point, but it’s pretty awesome that something so simple can have such a profound impact on your life. Well, in this case, my life.

A few months ago, I started getting a little bored at work. Things were a bit slow, and I always get a bit moody when I’m not insanely busy or when I have too much time on my hands. I shouldn’t have been bored – I really loved my job, it was challenging and rewarding technically, socially, and many other ‘ly adverbs. I had carved out a niche where I was respected and treated well, and the team had some top notch people on it. We were working on some pretty revolutionary technology that could someday make big headlines and change computing in cool new ways.

But I found myself getting agitated. Well, maybe that’s not the right word. More like, unfulfilled. I’ve always had some vague general career goals, but never really put the whole game plan together. I just knew that someday, somehow, my goals would take me away from my current job and in search of something else.

So when my friend brought up the “run towards something” speech, it really hit home for me. I had a great job, was comfortable, my family was settled into a nice routine in a great family town, and everything was status quo. So I knew with 85% certainty that I should just put my head down, enjoy life, and take a chill pill. And above all, not give up what I had for the wrong reason… I shouldn’t run away from a great thing.

But what he said bothered me a lot too, because it made me realize that I didn’t have a concrete plan for my career. I honestly didn’t know what I would run towards, career-wise. I’m a low level, OS and drivers and filesystems type of guy who loves tinkering with Linux and wants to make an impact on the real world and real end users. But what do I actually want to DO with my career?

When I get to the end of my career and look back at what I’ve done, will the world be any different because I was here? And if not, will I be OK with that? What do I want to look back on and be proud of because I did it, and did it well?

While I think there are much more important things than work and technology — relationships, how you treat other people, your family, people you can teach, etc — these career-centric questions bothered me for weeks.

The good news is, I figured it out.

I have a plan for my career, I know what I want to run towards. I now have several goals I want to reach for, things I think will make an impact on the world in some small way.

My friend was right, this is awesome. It’s seriously an amazing feeling to finally know what I want to do. And in the end, maybe I won’t actually get it all done. Maybe I’ll fail, and fail big a few times. But at least I will have tried.

The tricky part about this knowledge is acting on your plans.

One of my first plans is taking me and my family on a wild sprint out of our comfort zone, to another city and another job… but I’m ok with that. I’ve taken a job that will require us to relocate, and it’s a bit weird because of a transitional commute while we make the moving arrangements.

So as I’m sitting on a train watching the world go by, I find myself feeling super excited about running towards something I want. Tired, exhausted really, and a bit overwhelmed by all the new ideas/people/work and things to learn… But excited.

And I’m super excited about seeing my family at the end of the train ride… So technically that’s TWO things I’m running toward. 🙂

Anyways, this rambling story has gone on long enough. If I haven’t put you to sleep already, here’s wishing us all insane amounts of success.

Run towards something! You’ll be glad you did.


Categories: Life Tags: , ,

gcc optimization case study

January 4, 2011 3 comments

I’ve used gcc for years, with varying levels of optimization… then the other day, I got really bored and started experimenting to see if gcc optimization flags really matter at all.

Trivial Example

It’s probably not the best example I could’ve come up with, but I hammered out the following code as a tiny and trivial test case.

/* bar.c - trivial gcc optimization test case */
#include <stdio.h>

int main(int argc, char **argv)
{
	long long i, j, count;

	j = 1;
	count = (100*1000*1000LL);
	for (i = 0; i < count; i++)
		j = j << 1;
	printf("j = %lld\n", j);
	return 0;
}

The Makefile is pretty straightforward, nothing super interesting here:


CC = gcc
CFLAGS = -O2
TARGET = bar
OBJS = bar.o

all: $(TARGET)

$(TARGET): $(OBJS)
	$(CC) $(CFLAGS) -o $(TARGET) $(OBJS)

clean:
	-rm -f $(OBJS) $(TARGET) *~

It’s pretty boring code that loops 100 million times doing a left shift. Ignoring the fact that the result is useless, I ran gcc with a few different flags, with the following results. Runtime is in seconds; the asm was generated with objdump -S, and I’m only showing the loop code below.

gcc flags            runtime (avg of 5)  loop disassembly
-O0                  0.2314 sec          29: shlq -0x10(%rbp)
                                         2d: addq $0x1,-0x8(%rbp)
                                         32: mov  -0x8(%rbp),%rax
                                         36: cmp  -0x18(%rbp),%rax
                                         3a: jl   29
-O1                  0.078 sec           e:  add  %rsi,%rsi
                                         11: add  $0x1,%rax
                                         15: cmp  $0x5f5e100,%rax
                                         1b: jne  e
-O2                  0.077 sec           same loop
-O3                  0.077 sec           same loop, different return
-Os                  0.072 sec           7:  inc  %rax
                                         a:  add  %rsi,%rsi
                                         d:  cmp  $0x5f5e100,%rax
                                         13: jne  8
-O2 -funroll-loops   0.014 sec           10: add  $0x8,%rax
                                         14: shl  $0x8,%rsi
                                         18: cmp  $0x5f5e100,%rax
                                         1e: jne  10

Runtime was measured with “time ./bar” on a Core i7-920 quad-core CPU with 6 GB of DDR3 memory.

Interestingly enough, the fastest gcc compile options for this useless code sample are -O2 -funroll-loops. That version is faster because gcc performs an 8-way unroll, so the loop does roughly 1/8th the work. It works in this trivial example because gcc literally replaces eight left-shift-by-one operations with a single shift-left by 8.

So that’s semi-interesting, but all I’ve proved so far is that gcc does in fact optimize, and that a trivial loop gets much faster when it does.
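In C terms, my understanding of what the unrolled loop is doing looks roughly like this (a sketch of the transformation, not gcc’s actual output):

```c
/* Eight iterations of j <<= 1 collapse into one j <<= 8, with the
 * loop counter advancing by 8 each pass; a second loop mops up any
 * remainder when count isn't a multiple of 8. */
long long shift_unrolled(long long j, long long count)
{
	long long i;

	for (i = 0; i + 8 <= count; i += 8)
		j <<= 8;	/* 8 single-bit shifts folded into one */
	for (; i < count; i++)	/* remainder iterations */
		j <<= 1;
	return j;
}
```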

Multi-threaded app

I was curious to see how this plays out on non-trivial programs, and I had some code at work that needed this kind of analysis anyways – so I gave it a whirl on my multi-threaded app. I can’t share the source code or the disassembly, but I saw some very odd results, and I’m honestly not 100% sure why.

flags               runtime (approx)
-O0                 8.5 sec
-O2                 12.5 sec
-O3                 13 sec
-Os                 7.9 sec
-Os -march=core2    7.7 sec

What I find really interesting is that with -O2 and -O3, the application runtime gets WORSE instead of better. My best guess is that with -O2 and aggressive inlining, the code size blows up and cache hit rates suffer. I haven’t investigated it and probably won’t take the time, but I found it rather fascinating to see such a swing just from compiler flags.

FIO benchmark

Anyone who has visited my blog before probably knows that I’m a big fan of Jens Axboe’s fio benchmark. I decided to run a similar experiment, using fio to benchmark /dev/ram0.

I’m using fio 1.44.2, compiled from source on my local Ubuntu 10.04 amd64 system.

$ sudo ./fio --bs=4k --direct=1 --filename=/dev/ram0 \
    --numjobs=4 --iodepth=8 --ioengine=libaio --group_reporting \
    --time_based --runtime=60 --rw=randread --name=randread

There’s no real rhyme or reason for the workload I chose (4k blocks, 4 jobs, iodepth=8, etc.), I just wanted something with a few threads and some outstanding IO to have a good chance of getting high bandwidth.

flags                bandwidth (MB/s)  avg latency (us)
-O0                  1208              1.04
-O1                  1524              0.83
-O2                  1645              13.95 or 0.78 (very odd)
-O3                  1676              0.76
-Os                  1543              3.29
-O2 -funroll-loops   1667              0.77 or 13.33 (odd again)


With three very different examples, I get three very different sets of results.

In my trivial, stupid for loop example, unrolling the loop made a lot of sense and using -O2 didn’t matter a whole lot.

In my multi-threaded app, most optimizations actually made the runtime worse, but size and architecture made a difference.

And with fio, the results are pretty much what you’d expect – higher levels of optimization make the benchmark faster.

What’s the conclusion?

Compiler flags matter, but like all optimizations, their usefulness and impact are highly workload- and application-dependent.

So… you just gotta try them all, see what is best for your project. I know, killer conclusion, but hey… I just tell it like I see it.

Thanks for reading – I’d love to hear your comments/feedback below.


Categories: Uncategorized Tags: , ,

2010 in review

January 2, 2011 Leave a comment

The stats helper monkeys mulled over how this blog did in 2010, and here’s a high-level summary of its overall blog health:

Healthy blog!

The Blog-Health-o-Meter™ reads Wow.

Crunchy numbers


The average container ship can carry about 4,500 containers. This blog was viewed about 17,000 times in 2010. If each view were a shipping container, your blog would have filled about 4 fully loaded ships.

In 2010, there were 22 new posts, not bad for the first year! There were 23 pictures uploaded, taking up a total of 3 MB. That’s about 2 pictures per month.

The busiest day of the year was May 12th with 4,426 views. The most popular post that day was Three Things I Love About C.

Where did they come from?

The top referring sites in 2010 included Google Reader, among others.

Some visitors came searching, mostly for hg vs git, sigtrap, git vs hg, i love c, and hg vs git 2010.

Attractions in 2010

These are the posts and pages that got the most views in 2010.


Three Things I Love About C May 2010


SVN to Git September 2010


True Zero-Copy with XIP vs PRAMFS November 2010


Cloning a clone: Hg vs Git May 2010


TortoiseGit – round 2, fight! April 2010
1 comment

Categories: Uncategorized

Sorting for Chumps

December 9, 2010 Leave a comment

A terse post about sorting, written for chumps by a chump

If you’ve never spent time on Urban Dictionary, you’re missing out on one of the finer things in life.

Chump. A sucka that tries to act cool, but is really a fool and tries to act tough, but really isn’t


I’m a chump who’s been out of University for a few years now, so I’ve long since forgotten anything that I don’t use in my daily job. Lately, I’ve been going back over some of the things I used to know and trying to re-learn basic fun things like sorting and runtime analysis.

I won’t bore you with the distinction between big-O and big-theta, but there’s a fun table on Wikipedia’s Big O notation page for the curious.

For chumps like me who appreciate how stunningly cool the math is, but just want real-world performance data… you’ve come to the right place.

Test methodology

As preparation for this blog post, I wrote a small sorting library and “integer sorting” program which you can find on github. I named it libsort just because I’m cool like that. The code isn’t the best I’ve ever written, but most of it was hacked out on iSSH so… there you go.

libsort currently implements the following algorithms:

  • bubblesort: O(n²) runtime, O(1) extra space
  • mergesort: O(n log n) runtime, O(n) extra space
  • quicksort: O(n log n) average runtime, in-place (O(log n) stack space)

I started with bubblesort because it looked easy (remember, I’m a chump :)) and it’s also O(n²), so it should make for a good comparison against the faster algorithms.

Mergesort and Quicksort really weren’t that much harder to implement… I picked slightly optimized versions and coded them up based on Wikipedia’s pseudo-code. Man I love pseudo-code…
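For the curious, an in-place quicksort in that style looks something like this. This is my sketch based on the usual Lomuto-partition pseudo-code, not the actual libsort source:

```c
#include <stddef.h>

static void swap_ll(long long *a, long long *b)
{
	long long t = *a;
	*a = *b;
	*b = t;
}

/* Sort arr[lo..hi] inclusive; call as quicksort(arr, 0, n - 1) for n >= 1. */
void quicksort(long long *arr, size_t lo, size_t hi)
{
	if (lo >= hi)
		return;

	long long pivot = arr[hi];	/* last element as pivot (Lomuto) */
	size_t i = lo;
	for (size_t j = lo; j < hi; j++)
		if (arr[j] < pivot)
			swap_ll(&arr[i++], &arr[j]);
	swap_ll(&arr[i], &arr[hi]);	/* pivot into its final slot */

	if (i > lo)			/* guard size_t underflow at i == 0 */
		quicksort(arr, lo, i - 1);
	quicksort(arr, i + 1, hi);
}
```

Picking the last element as pivot keeps the code short, but it degrades to O(n²) on already-sorted input; serious implementations use median-of-three or random pivots.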

I also ran two versions of quicksort and mergesort, with some minor changes like malloc’ing once beforehand rather than having each recursive call malloc temporary space. You’ll see this show up in the graphs as quicksort_onemalloc below, etc.
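The “onemalloc” idea is simple: allocate the scratch buffer once at the top and pass it down, instead of malloc’ing in every recursive call. A sketch of that shape (again, not the literal libsort code):

```c
#include <stdlib.h>
#include <string.h>

/* Sort a[lo..hi) using tmp as scratch space. */
static void merge_rec(long long *a, long long *tmp, size_t lo, size_t hi)
{
	if (hi - lo < 2)
		return;

	size_t mid = lo + (hi - lo) / 2;
	merge_rec(a, tmp, lo, mid);
	merge_rec(a, tmp, mid, hi);

	/* merge the two sorted halves into tmp, then copy back */
	size_t i = lo, j = mid, k = lo;
	while (i < mid && j < hi)
		tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
	while (i < mid)
		tmp[k++] = a[i++];
	while (j < hi)
		tmp[k++] = a[j++];
	memcpy(a + lo, tmp + lo, (hi - lo) * sizeof *a);
}

int mergesort_onemalloc(long long *a, size_t n)
{
	long long *tmp = malloc(n * sizeof *tmp);	/* one allocation, up front */
	if (!tmp)
		return -1;
	merge_rec(a, tmp, 0, n);
	free(tmp);
	return 0;
}
```

Cutting the per-call malloc/free traffic doesn’t change the big-O, but it removes a lot of allocator overhead from the hot path.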

Test machine

The system I ran the tests on looks like this:

  • Intel Core i7-920 @ 2.67 GHz (quad-core with 2 threads/core = 8 virtual CPUs)
  • 6 GB of DDR3 memory (sorry, don’t know the bus speed)
  • Ubuntu 10.04 Desktop amd64
  • a regular old SATA HDD

The Results

You can download the spreadsheet I put together from the raw data. Or you can be a chump like me and just look at the graphs.

Result #1: bubblesort sucks.

Bubblesort does SO poorly once you get beyond 10,000 elements that I stopped measuring it. O(n²) really hurts… 100K elements took over 110 seconds. That’s really, really bad.

Result #2: quicksort did slightly better than mergesort, but not enough to really matter a whole lot for most purposes.

The graph above doesn’t really do justice for just how fast sorting is for small lists. Both of these sorting algorithms can sort over 1 MILLION integers in less than a quarter of a second. You really don’t see much difference until you get past 100 million, then my implementation of an in-place quicksort starts to really shine.

Looking at it in a log10-scale layout:

The log scale shows a pretty much linear increase in log-time, which sounds pretty good to this chump.

That’s all for now, please check out the libsort code on github and let me know if I’ve made any stupid mistakes, etc.

Thanks for reading – now let me know what you think!

Categories: Uncategorized Tags: , , ,