Managing PDFs in OS X

The below is from a post on secondfoundation.org, which kinda wigged me out.
I’ve just spent about an hour and a half going through the archives on DrunkenBlog and making PDFs of the posts that interest me. Even though it’s just going to dilute my iTunes library, I’m filling it up with various articles and lengthy pages from the web that would otherwise just sit forgotten on my “Bookmarks” menu.
First of all, it's really cool that someone would like something on the site enough that they'd want to keep it around, and even cooler that there might be multiple things... but the idea of those words being set in stone somewhere does kind of wig me out a bit.
After all, I do plan on revising some of my older stuff in 2008. And part of me wonders just what would make up his greatest hits -- I can imagine some of the ones that went boom are in there, but is the ham story? Deconstructing H.264/AVC?
Bygones. What really wigged me out was the idea of using iTunes to manage his PDFs -- iTunes is a pretty specialized program, and using it to manage his PDFs can't exactly be ideal. Then again, what alternatives are out there really? The man makes a good point.
It's extremely easy to whip out tons of PDFs in OS X, and I have my own collection of web pages I've grabbed to read somewhere else and even e-books, some of which I paid a pretty penny for. Truth is, I don't make as much use of them as I could, as they aren't really searchable, at least not in a powerful way.
I'm aware of a few pricier apps that do general cataloguing of media files, but even these are pretty limited when it comes to PDFs and e-books, at first glance. If you have 500 PDFs and are looking for a quote or script or anything, it's just generally easier to google online than to try to find it locally.
On Windows the situation is much better with the introduction of things like Google Desktop Search, but right now on OS X iTunes might actually be the best bet for a normal user. If I'm off-base on this, feel free to fill me in.
The good news is that when Mac OS X 10.4 ships, enterprising developers should be able to take advantage of Spotlight and PDFKit to whip out small apps that would make short work of the problem.
Hopefully a few are...
Comments (19)
Posted by: Frank at March 2, 2005 05:52 PM
Your post sounds as if you don't know DEVONthink, yet… ;)
Posted by: Andre Lapierre at March 2, 2005 06:06 PM
Third Street Software just released 'Sente'. It is advertised as 'iTunes for academic literature', i.e. PDFs. The student copy is $49.95; the full version is $99.95.
Andre
Posted by: list author at March 2, 2005 06:11 PM
DEVONthink looks kind of cool, but $40 to $75 just to manage my PDFs makes Tiger look pretty good.
Posted by: Chucky at March 2, 2005 06:21 PM
iTunes is a pretty specialized program, and using it to manage his PDFs can't exactly be ideal. Then again, what alternatives are out there really?
DEVONthink, DEVONthink, DEVONthink.
If you like accumulating an organizable and searchable library of text of any kind - PDF, TXT, RTF, or HTML - DEVONthink is a godsend.
Posted by: Magnes at March 2, 2005 06:25 PM
LOL I think you knocked that site offline... slash-blog. How many readers DO you have?
Magnes
Posted by: A N Other at March 2, 2005 06:53 PM
I'll second DevonThink. It'll save and index web pages as html, too. And it supports services. So I select some text, hit the services key commands, and everything's saved in DT - along with a link to the original.
Another tool I find useful is History Hound, for the cases where I want to find something I've read but can't remember where. Enter some search text and it lists the relevant pages I've viewed. Sure, I could Google, but that's likely to be overwhelming. History Hound indexes only pages I've seen.
Posted by: Jesse at March 2, 2005 07:34 PM
Seems like DEVONthink does do pretty much exactly what I had in mind when I made those posts, and iTunes is certainly less than ideal (having a bunch of web site authors show up in my "Artists" list when browsing doesn't do me any good, for one). I'm not sure I care enough to put out money for it yet, though. Maybe I'll take the opportunity with Tiger to finally get around to learning the Cocoa APIs and make it a project.
Incidentally, I just went on an all-around PDF-making binge when I posted about that, so I've got all kinds of articles from sites that I won't necessarily read, but they're good to have for the moments when I think "hey, I remember reading an interesting post about that on some site" and don't want to go searching through all my bookmarks and RSS feeds. Plus, I thought the ham story was pretty funny and I like to keep some things around that are good for a chuckle on a train ride.
Posted by: denny at March 3, 2005 12:49 AM
add another vote for devonthink. this is at the top of my list of favorite apps. fast and easy importing of pdf, rtf, plaintext, html files that are already on your desktop. just as impressive, ease of creating new files from websites via the services menu. i archive 2 years worth of blog posts in no time. underneath it all is a fantastic content search process that finds similar and related content.
for anyone (particularly writers and researchers) with a large library of text clippings, files, etc, this app is the best tool imaginable.
Posted by: Chucky at March 3, 2005 01:11 AM
Another tool I find useful is History Hound
Reason #7 why I use OmniWeb instead of Safari is the built-in full text search of web page history.
Posted by: NoPCZone at March 3, 2005 02:27 AM
Yet another vote for the great stuff from Devon. Why their stuff isn't included, at least in demo form, on PowerBooks is beyond me. A free alternative is the library function built into Adobe Reader. Most people who use the free utility never take the time to discover and harness the power of the app.
Posted by: Paul Turnbull at March 3, 2005 07:57 AM
++DevonThink
I use it as an article dump. I bought it on its ability to categorize items into multiple places and because it's very good at figuring out where things should go for you.
Posted by: Jon Henshaw at March 3, 2005 08:10 PM
I don't really see a need for any 3rd party tools when 10.4 comes out. Shouldn't Spotlight be "exactly" what you need to search PDFs? I guess it might be cool to have a 3rd party app that searches and manages only PDFs, but again, the whole point of Spotlight is to manage and find ALL of your data, and to find exactly what you're looking for. Of course, if Spotlight sucks and still can't help you find what you're looking for through your mess of data files, then yes, a 3rd party PDF manager may be necessary.
Posted by: Chucky at March 3, 2005 08:41 PM
Shouldn't Spotlight be "exactly" what you need to search PDFs?
Given that Spotlight won't full-text index long PDFs, I'd say the answer is no.
Posted by: L. Thomas Martin at March 4, 2005 12:52 AM
You wrote: "What really wigged me out was the idea ... of using iTunes to manage his PDFs . . . using it to manage his PDFs can't exactly be ideal. Then again, what alternatives are out there really? The man makes a good point."
In Adobe Reader 7, File > Digital Editions > My Digital Editions > Add File. It works just fine for any PDFs you have on your computer. Thumbnails or list view.
LTM
Posted by: Kim Gammelgård at March 4, 2005 08:42 PM
When I have a job that demands data mining, I just make pdf's of all the files and put them in a folder and let OSX index them. After that I simply use cmd-f and specify that folder to search anything in there and the search capabilities of Preview to find whatever I need within the files after the first search. I even use it to look up bus and train schedules now!
Say that I am in the city somewhere, I just look up a common address and check all the occurrences of busses that go there in the pdf-bus schedule I downloaded. It is simply great :-)
Posted by: Eric at March 8, 2005 04:45 PM
Will DEVONthink perform batch OCR on images scanned to PDF? I use Paperport 10.0 on my PC, which allows me to scan paper to PDFs and save them to any directory, and then Paperport automatically extracts the metadata and indexes the files. I hardly need a filing cabinet...
Posted by: Neal at March 9, 2005 11:02 AM
Quicksilver is a great way to find anything on a Mac, and it's free. It will change the way you use your machine, for the better.
It won't search inside PDFs, but if you give the PDF an appropriate title, it's easy to find. Quicksilver ( http://quicksilver.blacktree.com/ ) will also search bookmarks and link you directly to the site without opening a browser, just type in the word you remember. . .
Another way of cataloging material is with Notebook ( http://www.circusponies.com/ )--you can send selected text to a page with OS X services and it becomes fully indexable.
Posted by: James Howison at November 4, 2005 11:21 AM
BibDesk manages PDFs and academic metadata, and uses SearchKit for full-text search.

My God! That's so cool! Since when did iTunes support PDFs? :-D