Monday, August 31, 2009

Rich support for multiple (versions of) OSes: autoconf?

After many years of working on multi-platform applications (and now also multi-platform platforms), I find it incredibly difficult to be rich on all of them without either drastically increasing the cost of production (in testing resources) or reducing overall quality (screwed-up boundary cases, where the “boundary” is much, much closer to customer uses than one would like). One of the longer-term issues that especially affects the Mac platform for Silverlight is our mechanism for building the CoreCLR.

What we’re building today is highly related to Rotor v1 and Rotor v2. The Rotor project was the shared source release of the Common Language Infrastructure, aka the SSCLI. Rotor v1 was fairly multi-platform because Microsoft really wanted to show that their new CLI was a realizable technology on alternative platforms. It was released under the shared source license so that academics could peer under the covers and see that, despite a facile JIT-compiler and garbage collection mechanism, the system worked, and also how it worked. Figuring they’d proved their point1, they hamstrung the Rotor v2 update further: it no longer needed to work on all platforms. The nice framework they had built for Rotor v1, which allowed it to build on multiple platforms, had suffered bitrot and would build on nothing but Windows.

Fast-forward several years, past an internal project that never saw the light of day, and you get to the inception of the Silverlight project. They had a mechanism that would (mostly) build something for FreeBSD and, with some minor tweaks, would build for Mac OS X2. Furthermore, another (defunct) team had gone through the effort of porting the commercial x86 JIT-compiler and the commercial garbage collector to the GNU toolchain, so they had most of what they needed to start working3. Several teammates and I worked on this for a couple of years before it was picked up by the Silverlight team for the 2.0 release. Over the course of that time, some effort went into not completely destroying the ability to build on other OSes. That said, the only shipping product using the Rotor project at all was the Mac version of the CoreCLR4. I am positive not only that you could not build our CoreCLR on other platforms from our source, but that I, myself, included code and/or improper #ifdefs that would make it not work. Not by design, but simply by not having a product for those platforms.

Mono/Moonlight is both a blessing and a curse in this regard. As much as I might have wanted a business proposition that would have put a Microsoft-written CoreCLR on more platforms than just MacOS, the environment was/is not ripe for such an idea. The great deal we have with the Mono project means we’re likely to get the platform on many, many more OSes than Microsoft proper would have been willing to fund. On the other hand, the “curse” side is that there really is no platform other than MacOS for the autoconf-based/multi-platform-aware/multi-platform-capable build system to build for. No reason at all to have all this extra gook.

In fact, this gook gums up the works somewhat. We broke a whole bunch of the original assumptions when we finally released the Mac OS CoreCLR. We departed from the autoconf premise that you’re building on the machine the built code is meant to run on. Instead, we wanted to run on systems 10.4 and up, independently of whether we were building on 10.4, 10.5, or even pre-releases of 10.65. Furthermore, we wanted to be warned of potential future issues on 10.5 and later. Add to this the idea, before it was deprecated, that we might build x86 on ppc and vice versa, and the fact that there are build tools we need to create in order to build the product, and then there’s the product itself. The former need to run on the current operating system (even if through some kind of emulation; e.g., on x86_64, ppc (if Rosetta is installed) and x86 are valid build-tool architectures), and the latter need to have the cross-OS-version behavior we want (i.e., not using any APIs deprecated in one of the later OSes, and selectively using new-OS APIs via dlsym, CFBundleGetFunctionPointerForName, or weak-linking; there’s a sketch of this below). If we had gotten it right (bwahaha), we’d’ve cached config.guess files for other architectures and made sure the built products would Actually Run™ on the platforms for which they were built6. As it stands now, we have an overly complicated system that, yes, allows us to use the 10.5 compilers to build stack-canary support into the applications we use on 10.4 (and not when we’re building with the 10.4 compilers, which is purely an internal testing mechanism), but it also means we pass all sorts of extra grik around:
  • Mac OS SDK
  • Min ver (currently 10.4)
  • TBD: Max ver (as max as we can get it)
  • -arch flag for gcc (since autoconf cannot guess this with any utility uniformly across 10.4/10.5/10.6)
Plus, we use this external mechanism to enforce these same things on our partners’ Xcode projects (not everyone made the decision to use the same build system for both Windows & Mac, much less the NTBuild system that we inherited): we invoke xcodebuild with the specific SDK and the various other #defines we want.
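To make the selective-use pattern concrete, here is a minimal sketch of the dlsym route, assuming a 10.4 minimum OS and a desire to use backtrace(), which first appeared in 10.5. The wrapper name and the fallback are mine for illustration, not from our tree:

    #include <dlfcn.h>
    #include <stdio.h>

    /* Hypothetical wrapper: look up backtrace() at runtime so the same
       binary runs on 10.4 (where it doesn't exist) and on 10.5+. */
    typedef int (*backtrace_fn)(void **frames, int count);

    static void dump_stack_if_possible(void)
    {
        /* RTLD_DEFAULT searches every image already loaded in the process. */
        backtrace_fn bt = (backtrace_fn)dlsym(RTLD_DEFAULT, "backtrace");
        if (bt != NULL) {
            void *frames[64];
            int n = bt(frames, 64);
            printf("captured %d frames (10.5+ path)\n", n);
        } else {
            printf("no backtrace(); falling back to the 10.4 path\n");
        }
    }

Weak-linking gets you the same effect by leaning on the right ld flags instead; the dlsym route keeps the whole decision visible in the source.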

However, even when we get this right, and we use the old APIs on older OSes and the new APIs selectively, i.e., on the OSes where they’re supported, there’s no default mechanism for demonstrating that we’re doing it right. Nothing to call out that we have these references to specific deprecated APIs, but only on OSes where we expect them to be usable. No handy mechanism to segregate these uses so that they can be deleted when we change our own internal requirements to support a newer minimum OS. It’s all internal manual review. I suppose things move slowly in the OS world, but I’d prefer to be able to qualify all of these call sites with the right metadata: use this on downlevel OSes and this on modern OSes, and effectively remove it when it’s no longer necessary, with some kind of “deprecated code not included” warning to let us know, so we can remove it at our leisure.
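In lieu of real compiler support, you can approximate that metadata with a macro. This is a hypothetical sketch (MIN_SUPPORTED_OS and all the names are invented), leaning on gcc’s deprecated attribute so that raising the OS floor makes every leftover downlevel call site warn:

    /* Invented for illustration: tag each downlevel-only helper with the
       last OS that needs it. Once the build's minimum OS passes that
       point, the helper turns deprecated and gcc warns at every
       remaining call site. */
    #define OS_10_4 1040
    #define OS_10_5 1050

    #ifndef MIN_SUPPORTED_OS
    #define MIN_SUPPORTED_OS OS_10_4   /* mirrors our current 10.4 floor */
    #endif

    #if MIN_SUPPORTED_OS > OS_10_4
    #define NEEDED_THROUGH_10_4 __attribute__((deprecated))
    #else
    #define NEEDED_THROUGH_10_4
    #endif

    /* Declare with the attribute; define separately. */
    NEEDED_THROUGH_10_4 static void legacy_stack_dump(void);

    static void legacy_stack_dump(void)
    {
        /* ... the 10.4-only fallback lives here ... */
    }

    static void dump_stack_on_10_4(void)
    {
        legacy_stack_dump();   /* warns here once the floor moves past 10.4 */
    }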

Notwithstanding these cross-OS-version issues, there’s still the issue that the autoconf-centric mechanism gets stale by default. We regularly create or view new Xcode projects in newer versions of the Xcode toolsets just to see what flags get sent down to gcc/ld, so we can emulate the new behavior in the configure.in scripts. There’s a fairly rational argument to be made that we’d require less intervention if we just used Xcode projects for everything. The counter-argument is that, if we did, we would no longer be forced to be as introspective as we are now about how projects are built in order to get them to build, or, more importantly, to Work™; we’d blithely build and sort out the bodies later.

There’s still no real solution for having multiple config.guess files for the multiple OS versions your app supports, plus some summarizer that converts them into static code changes for the things that are determinable (like: all supported versions of the OS have the dlfcn.h functions), and into runtime checks with divergent behavior for the things that came into existence with a particular OS version. At this point, is autoconf too unwieldy to keep? Should we move to a simpler mechanism and live with the stuck-to-this-OS-ness it implies? Hard to tell. Any change will require some work. The only question is whether it’s more work than we would otherwise need to do over the long haul.
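For what it’s worth, that hypothetical summarizer’s output might look something like this (all names invented; this is not our build system): facts true across the whole supported range collapse into static defines, and facts that vary across it become runtime probes:

    #include <dlfcn.h>
    #include <stddef.h>

    /* Invented example of summarizer output for a 10.4-10.6 support range. */

    /* dlfcn.h exists on every supported OS version: decided statically. */
    #define HAVE_DLFCN_H 1

    /* backtrace() arrived in 10.5: decided at runtime on the user's box. */
    static inline int have_backtrace(void)
    {
        return dlsym(RTLD_DEFAULT, "backtrace") != NULL;
    }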
--
1Actually, I don’t know what they were thinking; this is supposition on my part. It predates my joining the CLR team by a couple of years.
2In fact, one of the reasons I got the job is that I complained to them that their Rotor v2 release had broken the Mac OS X build, and that I had some patches that would fix some of the damage. This put me in contact with one of the cross-platform devs (from whom I ultimately inherited most of the Mac responsibility) and the dev lead of the project, who was willing to give me contribute access even though I was from another dev organization. When they realized they needed a Mac guy, they talked to me. As it turns out, they might have benefited more from a Darwin/FreeBSD guy, since my knowledge was more old-skool Toolbox than born-again *nix (I had done *nix work in college when I wasn’t working on 68k code). At least it was at the time. No longer. Plus, now they know more about CoreFoundation than they ever wanted to know.
3Sadly, they didn't have a commercial-quality JIT-compiler for the PowerPC, so they just recirculated the good ole FJIT. (The “F” stands for “fast”, and by “fast”, I mean JIT-compiler throughput, not JIT-compiled-code execution speed.) The results were pretty shoddy — it worked but the word “performant” wouldn’t be seen in the same room as it.
4The build system for the Windows Desktop CLR maintained the rotor project for quite some time beyond its last release, in the event it would ever ship again. However, in the spirit of “testing just enough” (i.e., testing sparse points of the matrix), we stopped building the Windows rotor project some time ago, presuming we got better testing from having a real Mac rotor project that we actually had to ship.
5Don’t even get me started on the fact that you cannot write code on Mac OS X that is OS-SDK-independent without Major Hackery™. Regular changes to function prototypes break our warnings-as-errors compilations.
6In the days before PowerPC was removed from our list of shipping targets, we had the capacity to build the opposite architecture on the same machine, to make sure you hadn’t hosed the other side with your changes. That was all fine and good, except that the result would not run at all. It did a fine job of catching compiler warnings, but at the top of the world the endian #defines were all wonky, because the values for the current build machine had bled into the target architecture. In the event that PowerPC was ever made little-endian (not just little-endian mode, like the G4 and earlier), perhaps this might have worked.

Tuesday, August 04, 2009

Failure to clean

While considering what to do for MQ1, I ran across a presentation (video) that Peter Provost gave for NDC 2009. He has an awesome analogy between coding and cooking at 31m into the presentation:
I was very fortunate to work for a good chef, and one of the things that he taught me was to be constantly cleaning up as I was cooking… [If you are forced to have the big clean up at the end,] as you’re constantly piling on the scraps of food all over the counter, and the dirty knives and plates, it totally gets in your way. At the end you stop and say, “We can’t cook any more food for an hour; we got to clean up the kitchen.”
If you can imagine that a version six product (where a product cycle is about two years long) is like a kitchen in which people have been constantly working for twelve years, where at no time has anyone actually cleaned it up all the way, and where finding clean space to work in and clean tools to work with is getting progressively harder and harder, then you have a very good idea of what writing software2 is like.
--
1MQ is Microsoft parlance (perhaps others’ too) for a “quality” milestone. Milestones are one way to divide up a large set of tasks that you want to work on during the course of a single product cycle, and the ones which add features are generally M1, M2, etc. MQ (or M0) generally starts (or sits in the interstice between) product cycles, and generally focuses on infrastructure and code-quality improvements: things that affect whether customers will want our software only indirectly, by making us more efficient at making the changes that do.
2Well, writing software in a non-agile or limited-agility way.