Of Smart Crash Reports
As it turns out, my embargo is generally worthless, because no one is paying it any heed. Earlier today, Rosyna passed on word that their new treat, Smart Crash Reports, has reached Beta 5, along with some screenshots. "I'm embargo'd," I said, and then realized I was forwarding it on to twenty people I thought should know about it...
(Note: The download link on the front of the site isn't working, and most normal people -- including the Unsanity folks -- are asleep. However, this direct link will work until the non-vampires awake.)
When an application crashes on OS X, starting with 10.3, the system catches it and then pulls up a little app. Its window includes a region you can edit, and you're encouraged to write down any notes about what you think might have caused the crash, along with what you were doing before things went south. You then clicky the button, and it sends off your notes, along with the crash logs and a profile of your system, to Apple, where it presumably does some good.
Now, most people have some expectation that they're actually seen by someone, somewhere, and taking the time to do this is going to help figure out what's going on. However, one of the things I've found out in talking to 40+ people about how Apple is approaching Quality Assurance in their software now is that this just isn't what's happening.
What basically happens is that these get fed into a database system, so statistics can be run and analyzed on where some major problems might be cropping up and the configurations that might be showing it. By and large, the notes and lengthy crash reports you might be filling out are generally never seen except in the case of a programmer somewhere who might be trying to track down a problem and can then search for reports that might match with what he's trying to track down and glean what he can.
The other problem is that there's an expectation that if Microsoft Word is constantly crashing when you wake from sleep, and you fill out the crash report, the Office team will hopefully see that it's been crashing on wake from sleep for the last version of the OS and now has what they need to hopefully track down the problem. The vast majority of users simply don't have the knowledge to find the crash log, let alone who they should send it to at Microsoft, so this would theoretically be a very cool thing.
Of course, 3rd parties just don't really have access to the information at all via normal channels, which means they're dependent upon Apple deciding to share it with them, which they generally don't. All of these software companies out there are generally flying blind outside of their beta testers, and since:
- General users have no clue how to go about this themselves
- Crash dumps are incredibly important to see what's causing a problem
...Many have just taken it upon themselves to build their own crash reporting solutions directly into their applications. You can see this in apps from the Mozilla group, and OmniGroup, and even small apps like AdiumX, and it's a godsend for them in tracking down what might be going wrong. They often will suppress Apple's own Crash Report in favor of their own, which is a little different from this solution.
Anywho, the problem for these guys is that someone saying "Well, I launched it and it crashed", or something like it, is meaningless to a developer if they don't have a crash report to go through to see where it's hanging up.
As an example, take the Safari Image of Doom. Let's say you had an app which ties into WebCore and/or WebKit, that was set to load some web pages when it started up. If it happened to contain something akin to the Safari Image of Doom, it'd have a high likelihood of crashing when it launched, which is the kiss of death to a 3rd party app's credibility.
However, most people wouldn't necessarily make the connection that it was crashing on something that Apple is shipping rather than in the developer's code, and there's no way for the developer to track it down without suitable crash logs being sent to them, and that's way over the majority of their users' heads.
Let's just say I'm not pulling the above example out of my ass. I've talked to/polled over 40 people so far on my pet project, and was astounded at what, say, the people trying to use WebKit in a serious way are having to deal with. There's a vast difference between just loading an .html help file and actually having your app depend upon it.
As an example, in my informal poll of a few of the aggregators I kept hearing 85-90% of the bugs that cause their application to actually crash can be traced to WebKit/WebCore borking on what it's being fed in some way. I haven't talked to all of them, but enough that I know the number is reasonably accurate, and most of the range in percentages is due to the app authors tracking down the webcore/webkit bug and then writing extra code to work around it.
Don't think I'm bashing the people working on WebKit -- I have the highest respect for what they've done, given the task they're trying to accomplish with the resources they've been given, but there's no getting around that it's fragile and not where it needs to be.
To round back, third parties can only track down this stuff if they're able to figure out what's going on, which means they need that data, yet it all gets funneled to Apple and doesn't leave.
Slava @ Unsanity had the neat idea of making a way for developers to get their hands on that data as well as Apple. Basically, as a user, you install Smart Crash Reports on your system. When a crash occurs, if an application supports Smart Crash Reports, the user is able to fill out a report as normal, which then gets sent to Apple as normal, yet it's also sent to the application author.
Unsanity is known for their haxies, which often depend upon their code-injection framework known as APE. They've also done some pretty serious stuff just for the community of people at large: I.E., they were the first to find the horrific security flaw built into URL services, and released an app called Paranoid Android to help normal users fix the problem (for free) without having to resort to the terminal and such.
Personally, I think the people wigging at APE don't have a seriously solid understanding of how it works or some of its safeguards and realistic problems, but suffice to say, this app doesn't depend upon APE whatsoever, so it's a moot point. It's tying into input services, and not using APE. Get it through your head, because if you see some dork posting weird stuff about it on forums or something, it'll fall to you to educate the masses.
This is a brilliant idea, and a Good Thing, and they should be commended for it. While any developer can build in their own crash reporting system, if this gains support, they'll be saving a ton of third parties development time that can be then put towards their own applications.
About the best thing you can do is:
- Check it out, and put it through its paces
- Encourage your users to download and install it once you've knocked on the walls
I was really surprised by how easy it was to add support for this to your app. You simply add two keys (obviously, it needs to know whether it should intercede in the process at all and who the report should go to), and if Smart Crash Reports is installed, it picks up the reports when your app crashes and makes sure you get the data, either posted to a .cgi on a site somewhere or via email.
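To make the two-key idea concrete, here's a rough sketch in Python of how a tool like this might decide whether to intercede and where to send a report. The key names (`ExampleSCR_CompanyName`, `ExampleSCR_Destination`) and the function `crash_report_route` are made up for illustration and are not the real Smart Crash Reports keys or API; check the SDK for the actual names.

```python
import plistlib

# Hypothetical key names, invented for this sketch only.
OPT_IN_KEY = "ExampleSCR_CompanyName"       # presence means "yes, intercede for this app"
DESTINATION_KEY = "ExampleSCR_Destination"  # a .cgi URL or an email address


def crash_report_route(info_plist_bytes):
    """Return (company, kind, destination) if the app opted in, else None."""
    info = plistlib.loads(info_plist_bytes)
    company = info.get(OPT_IN_KEY)
    destination = info.get(DESTINATION_KEY)
    if not company or not destination:
        # One or both keys are missing: leave the normal crash flow alone.
        return None
    # Treat anything that looks like a URL as an HTTP post, otherwise as email.
    is_url = destination.startswith(("http://", "https://"))
    return company, ("http" if is_url else "email"), destination


# A stand-in for an app's Info.plist with both keys added.
sample = plistlib.dumps({
    "CFBundleIdentifier": "com.example.myapp",
    OPT_IN_KEY: "Example Software",
    DESTINATION_KEY: "https://example.com/cgi-bin/crash.cgi",
})

print(crash_report_route(sample))
```

The point of the design is that the app itself ships no crash-handling code at all: it only declares who it belongs to and where reports should go, and apps without the keys are left untouched.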
Getting it installed can be something else, but they have a little API in the SDK for installing it -- SmartCrashReportsInstall.
Check it out and install it, and if you happen to know a dev of some software you like, make sure they're aware of it, because it could save them a lot of work compared with rolling their own, and help improve their software's quality.
Nothing looks any different when it comes to sending a crash, so chances are you won't even be aware it's there -- it's seamless and transparent.
Users are really the lynchpin of the system -- a developer needs you to have it installed in order for it to do any good, so it's a Good Thing to install it and check it out. If you like it and it's not doing something weird, leave it. It's a small thing to do, but over time it could have an impact on the quality of the things you use.
This is going to come up, because someone out there is going to put on the tinfoil hat and try to raise noise because they can. There are very few potential areas where this could be abused from any party, including Unsanity.
If you're a third party, about all you'd have to worry about would be crash reports also being sent somewhere else besides yourself, which would be stupid for two reasons:
- It isn't exactly top secret info, as it's already created and sitting there.
- I watched network traffic with it installed and sending off a report, and it's not doing anything untoward.
The latter was overkill, but the truth is that I've known the people behind Unsanity for a while. They have my trust not to do stupid things when it comes to software, and to try to do the right thing when given a choice between two options. It's not a lengthy list, and I don't say it lightly.
I think opening the source should be looked at, not because of any lack of trust on my part about what they're doing, but because it could make it a lot easier for apps to distribute it and help get it installed on someone's system. I.E., it could help alleviate some of the chicken and the egg problem with getting this on people's machines, and I wouldn't have had to watch the network traffic just to say I did, and wouldn't have felt I had to vouch for them just to stave off the crazies.
Like most brilliant things, it's simple and obvious once it's out there and you wonder why it isn't already being done that way. The truth is, it should be being done this way already, and it should be something Apple is supplying instead of third parties finally deciding to just roll their own.
Unfortunately, there's little incentive for Apple beyond good developer relations and helping developers improve the quality of software across the board, and developer relations is often an oxymoron on the platform and can't be counted on. For Apple to roll this into OS X, they'll need to see it taking off outside of it.
I'm not being unduly harsh when I say that, as while there are people at Apple that go above and beyond because they get "Developers! Developers! Developers!", there's always been an undercurrent that Apple gives you the privilege of developing on the Mac while Microsoft is willing to do anything to get you to develop for Windows. Developers @ Apple generally get it, but management often doesn't, and there we have the problem.
Developers have wanted access to this information for a long while, and if they have to end-run while making sure Apple still gets what they want, so be it. They are still funneling the info into the black hole, they're just making sure it also gets sent somewhere where it can do the most good before it enters the Cupertino event horizon.
Mad love to Unsanity for stepping up.
[ Screenshots and Ancillaries ]
Comments (15)
Posted by: a macuser at September 8, 2005 10:10 PM
...They've also done some pretty serious stuff just for the community of people at large: I.E., they were the first to find the horrific security flaw built into URL services...
well - to be fair - they had a lot of help...
Posted by: Twist at September 8, 2005 10:13 PM
I love the crew from Unsanity. WindowShade X was the first piece of software I bought for Mac OS X (way back before APE and I haven't had to pay for an upgrade even though they have progressed from 1.0 to 4.0+) and I probably use it on an hourly basis. And ShapeShifter is the greatest thing since sliced bread IMHO. I have heard plenty of people bad mouth their software because it always shows up in the crash log and so they think that caused the crash but these people are generally idiots so I don't even bother pointing them to the post about how to read crash logs.
Posted by: scotfl at September 8, 2005 10:39 PM
Don't think I'm bashing the people working on WebKit ... but there's no getting around that it's fragile and not where it needs to be.
The WebKit project is an open-source project, have you taken your concerns to the developers? You're definitely a vocal champion of the open source process, you don't seem to mention the fact that WebKit is being developed according to that process. It's a weird disconnect.
http://webkit.opendarwin.org/
The 'image of doom' bugzilla entry:
http://bugzilla.opendarwin.org/show_bug.cgi?id=3340
Posted by: Colin Barrett at September 8, 2005 11:05 PM
Just a suggestion to the Unsanity people: Check out the Growl-withinstaller framework from the Growl project. Maybe you can get a similar thing working for Smart Crash Reports, so we can get this on as many boxes as possible.
Posted by: Colin Barrett at September 8, 2005 11:10 PM
I stand corrected (it's already there): http://www.unsanity.org/slava/smartcrashreports/SmartCrashReportsInstall/
Posted by: drunkenbatman at September 8, 2005 11:10 PM
Just a suggestion to the Unsanity people: Check out the Growl-withinstaller framework from the Growl project. Maybe you can get a similar thing working for Smart Crash Reports, so we can get this on as many boxes as possible.
I amended the post to include a link to that -- it's there, I just didn't have it in originally.
Posted by: Retard at September 9, 2005 03:12 AM
Hmm... the link on their site doesn't actually allow you to download anything. Bit strange...
Posted by: Retard at September 9, 2005 03:13 AM
Found it - the link on main page doesn't work, but the one found through screenshots... and following the link there does.
Posted by: Kool at September 9, 2005 04:48 AM
[...] and most normal people -- Including the Unsanity Folks, are asleep. [...]
So there are no normal people outside your timezone?! Thanks! :-P
Posted by: mikeash at September 9, 2005 05:09 AM
Why do people always jump to defend APE? The fact is, it uses the same technique to attach to processes as GDB, which does slightly change things.
As far as I'm aware, APE uses a mechanism similar to mach_inject (with added goodies to ensure a deterministic point in time for the code injection to take place during application startup), and gdb uses ptrace. The end result is similar, but the mechanisms aren't the same.
Now normally this won't make a difference. But sometimes it's just enough to push an app over the edge and make it crash.
APE itself does very, very little. An app that crashes when APE is installed but runs fine when APE is not installed will also be very likely to crash the next time Apple releases a security update, or you install an AppleScript OSAX, or a contextual menu item.
You're right that it's possible in theory for APE itself to cause an app to crash. However, this is the app's fault, not APE's, and I've never even heard of such a thing happening with a shipping app.
APE modules, of course, are a whole different story. A bug or unfriendly feature of an APE module can easily bring down the host application, and this does happen. But APE itself is very much harmless.
(GDB itself can do weird things to an app, too - I've had apps that crash with GDB attached and don't without it, and I had a kernel driver that didn't work normally, but worked with GDB attached.)
These are relatively common, and they're called Heisenbugs. What happens is the code is accidentally relying on the contents of uninitialized memory or other tiny details like that, and gdb can move these things around. Note that this is a bug, and it's something that needs to be fixed in the app, even if it only appears with gdb attached.
Posted by: Bob at September 12, 2005 11:50 AM
As a total NON-Developer (total average joe user, re: Kevin Ballard Sept. 2nd post), why COULDN'T this be rolled into developer's application installers? It can check for a pre-existing version, installing itself if newer. Licensed out to developers, it could become ubiquitous, much as Allume Systems' (née Aladdin) Installer Vise of OS9 heritage. Us Average-Joe-Users wouldn't have to seek it out, hunt it down, wonder what we were downloading and installing. Our favorite apps (and by extension: admired developers) would vouch for it, provide it for us and educate us in its importance. Perhaps this is also an argument for open-sourcing it, relying on your generosity of spirit. No doubt there's a laundry list of issues I'm unfamiliar with in proposing this, but the thought of it struck me almost immediately, less than half-way through DrunkenBlog's article on it (which sent me here). Come to think of it, perhaps I'll post this there, too.
For what it's worth, I run OSX 10.4.2 on a 400 MHz Blue & White, where it crashes all the time (BRAND new install, no haxies, mods or alterations, clean and basic) and on a new 1.67 GHz 17" pBook, factory install, where it's stable, predictable, works-as-it-should (also no mods). For our B&W, I'd be sending out Crash Reports like mad (and did, to Apple all through May '05)!
Posted by: Bob at September 12, 2005 11:52 AM
Sorry, my above post was copied from Unsanity's website, where it references the Sept. 2nd post of another's. Oops.
Posted by: vastheman at September 13, 2005 08:49 PM
mikeash, no insult, but I've read that before. It just plain isn't true. APE isn't using the same technique as mach_inject et al, it's using the same system as GDB. And not all the crashes that happen when APE is installed or GDB is attached are caused by depending on uninitialised or freed memory being in a particular state. I'll grant that most of them probably are, but some of them are genuinely caused by the change in environment caused by having a debugger attached.
You can use electric fence debugging to make sure you're dealing with memory properly. It wastes addressable space, but buffer overruns, double frees, accessing freed memory and accessing unallocated memory will all cause the app to die with electric fence enabled.
Posted by: slava at September 15, 2005 05:05 AM
vastheman: um, what are you basing your facts on? =) APE doesn't use the same method GDB does to attach to running processes... In fact, mach_* works in pretty much the same way APE does.
Either way, bugs happen, either in our APE module code or in the injected application code... And as I said before, we take full responsibility for it.
Either way, let's not turn this into an APE discussion, since it's not related to the topic of post at all. ;)








Why do people always jump to defend APE? The fact is, it uses the same technique to attach to processes as GDB, which does slightly change things. Now normally this won't make a difference. But sometimes it's just enough to push an app over the edge and make it crash.
(GDB itself can do weird things to an app, too - I've had apps that crash with GDB attached and don't without it, and I had a kernel driver that didn't work normally, but worked with GDB attached.)
People probably do blame APE for a lot of problems that are caused by other things, so I guess APE is copping it harder than it really should.
</rant>
Anyway, smart crash reports looks really cool! I'm checking it out now.