I'm looking for the fastest way to obtain the value of π, as a personal challenge. More specifically, I'm using ways that don't involve #define constants like M_PI or hard-coding the number.

The program below tests the various ways I know of. The inline assembly version is, in theory, the fastest option, though clearly not portable. I've included it as a baseline to compare against the other versions. In my tests, with built-ins, the 4 * atan(1) version is fastest on GCC 4.2, because it auto-folds the atan(1) into a constant. With -fno-builtin specified, the atan2(0, -1) version is fastest.

Here's the main testing program (pitimes.c):
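As a rough, portable illustration of the kind of comparison described (this is not the original pitimes.c), a timing helper for the libm-based variants might look like the sketch below. The names pi_fn, pi_atan, pi_atan2, pi_acos, pi_asin, and time_pi are hypothetical. The acos and asin variants are my own additions for comparison, and the inline assembly baseline is left out because it is not portable.

```c
#include <assert.h>
#include <math.h>
#include <time.h>

/* Hypothetical names throughout; none of this is taken from pitimes.c. */
typedef double (*pi_fn)(void);

/* Each variant derives pi from a libm call rather than M_PI or a literal. */
static double pi_atan(void)  { return 4.0 * atan(1.0); }
static double pi_atan2(void) { return atan2(0.0, -1.0); }
static double pi_acos(void)  { return acos(-1.0); }      /* assumption: extra variant */
static double pi_asin(void)  { return 2.0 * asin(1.0); } /* assumption: extra variant */

/*
 * Calls fn `iters` times and returns the elapsed CPU time in seconds.
 * The last computed value is stored in *last when last is non-NULL.
 * The volatile sink forces a store on every iteration, but the compiler
 * may still fold a call with constant arguments into a constant unless
 * built-ins are disabled (for example with -fno-builtin on GCC).
 */
static double time_pi(pi_fn fn, long iters, double *last)
{
    assert(fn != NULL && iters > 0);

    volatile double sink = 0.0;
    clock_t start = clock();
    for (long i = 0; i < iters; i++)
        sink = fn();
    clock_t end = clock();

    if (last != NULL)
        *last = sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

To compare the variants, call time_pi once per function from main with the same iteration count and print the returned times. Building once normally and once with -fno-builtin should show whether constant folding is responsible for any difference. Depending on the platform, you may need to link with -lm.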