2005-12-06 12:46

Storing Nuggets of Information

Latest Update: Added link to Doug’s ‘Digital Packrat’ article.

Introduction

The Situation

Some of my favourite non-fiction writers have an amazing ability to spout any number of fascinating and amazing facts about any subject that comes up. Bill Bryson, for example, or Chuck Palahniuk. At first, I just assumed they had utterly incredible memories, and just knew all this stuff. After a while, though, it occurred to me that they were actively finding this stuff, and storing away whatever they came across from day to day.

Researching stuff as and when needed is pretty easy now. Google will find facts about pretty much anything. There’s some skill to using it well, but it’s not difficult. Storing stuff away day to day, though, is still surprisingly challenging. If you want to be a good factual writer, it’s an important thing to work out.

Purpose

So why am I writing this? I want ideas. I’ll suggest a few of my own thoughts, but I’m interested in how other people have approached this. If nothing else, it’ll make a good article for our writing section. Oh, and getting things written down into an article can be a great way to help clarify your own thoughts.

Why It’s Difficult

It’s not something that sounds difficult to do, but I think there are a few problems here.

Quick Storage

If it takes too long to note something down, you won’t bother. If it takes too long to file that note away, you’ll never get around to it, and you’ll be left with just a stack of unsorted notes.

Quick Retrieval

When you come to write an article about eggs, you need to be able to find all the snips of info you’ve collected about eggs. And the quicker you can do that, the quicker you can write that article. If it takes too long, or too much effort, you’ll probably give up, and just use what you remember. The article won’t be as good, and your book (“I Really Like Eggs”) won’t sell very well.

Long Term Storage

You’re not storing stuff away to access next week. You might not need some of this stuff for years. That gives you two problems.

Safety

You need the notes to stay safe for some time. If you lose everything when your hard drive dies, or the paper notes go mouldy in a damp drawer, you’ve lost some important value.

Long-term Accessibility

Shouldn’t be a problem for paper, but if you store everything in computer files, you need to be able to access them when you need them. Will the file format you’re using be readable by the computer you’ll be using in ten or twenty years?

Solutions Writers Have Used

I don’t know how most writers approach this, though I’m interested to hear. A couple I do know about…

Chris Bidmead

I’m fairly sure it was Chris Bidmead who wrote about this a long time ago in PCW Magazine (UK computing magazine). He used plain text files, as they’re the only almost universal format. The filenames were just dates with a serial number appended. The first line of each file was in a specific format, with the source, title, and keywords in it. He then had scripts that could search all the files for anything matching specific sources, keywords, etc.

I remember him mentioning at the time that he’d kept the years as one digit, because the system was only designed with a ten-year lifespan in mind. Not sure what he did when he hit that limit. Could be easily enough expanded by using a folder for each decade, I suppose, or just renaming the files.
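
A rough sketch of how that sort of header search might look in Python. The header layout here (source, title and keywords separated by ‘|’, with commas between keywords) and the names parse_header and find_notes are my own assumptions for illustration, not Chris’s actual format or scripts.

```python
from pathlib import Path


def parse_header(line):
    """Split a 'source | title | kw1, kw2' header line into its fields."""
    parts = [part.strip() for part in line.split("|")]
    if len(parts) != 3:
        return None  # not a header in the assumed layout
    source, title, keywords = parts
    return {
        "source": source,
        "title": title,
        "keywords": {k.strip().lower() for k in keywords.split(",") if k.strip()},
    }


def find_notes(folder, keyword=None, source=None):
    """Yield (path, header) for notes whose first line matches the filters."""
    for path in sorted(Path(folder).glob("*.txt")):
        with path.open(encoding="utf-8") as handle:
            header = parse_header(handle.readline())
        if header is None:
            continue  # skip files without a recognisable header
        if keyword and keyword.lower() not in header["keywords"]:
            continue
        if source and source.lower() != header["source"].lower():
            continue
        yield path, header
```

Something like find_notes("notes", keyword="eggs") would then return only the notes tagged with eggs, not every note that happens to mention them in passing.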

Chuck Palahniuk

I’ve just read Chuck’s ‘Stranger Than Fiction’, and he mentions his filing system in there – a wall of filing cabinets he stores everything in. It’s in the context of receipts, though, so I don’t know if he uses something similar for notes too.

Bill Bryson

Bill is probably my favourite fact-based author, and I don’t know how he approaches this problem. He seems too convinced that his computer hates him, though, to be likely to entrust it with storing all of his notes. I could be wrong. There are a few hints in Notes from a Big Country – he mentions scribbled notes on bits of paper, and storing notes in a file labelled ‘Absent Mindedness’, so paper filed by subject matter would seem to be it.

My Thoughts on the Problems

Quick Storage

Most of the stuff I find these days that I might want to store probably comes at me through my computer, so that seems the most sensible place to put stuff for speed’s sake. Printing it out for paper storage would add time and effort. If most of your stuff comes at you printed, you might be pushed in the other direction by the extra time needed to scan or grab digital snaps of the items.

The time needed to store things depends on how they’re stored, and in what formats. Copying and pasting into a text editor should work pretty well for storing plain text. Lots of apps can now export HTML. Getting stuff into an application like Microsoft OneNote is pretty easy – just copy and paste. Custom-built databases for this sort of purpose might make things even easier.

Quick Retrieval

Full-text search is probably quite important here, in whatever solution you choose. I think it would be important also, though, to be able to do just a keyword search. So you’re not scanning through every article you’ve ever snipped that mentioned eggs, just the ones about eggs.

Long-term Storage

Safety

If you’re using paper, I guess your enemies here are fire and water. For computers, just be very sure of your backups. Enough good backups can protect against pretty much anything, as long as you keep copies off site.
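
Being ‘very sure’ of a backup can be partly automated. As a minimal sketch, assuming the notes live in one folder and the backup is a plain file-for-file copy of it, this compares SHA-256 checksums of every file. The function names and the folder layout are hypothetical.

```python
import hashlib
from pathlib import Path


def checksums(folder):
    """Map each file's path (relative to folder) to its SHA-256 digest."""
    root = Path(folder)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def backup_problems(original, backup):
    """Return (missing, changed): files absent from the backup or differing in it."""
    source, copy = checksums(original), checksums(backup)
    missing = sorted(name for name in source if name not in copy)
    changed = sorted(
        name for name in source if name in copy and source[name] != copy[name]
    )
    return missing, changed
```

If both lists come back empty, the copy matches the original at the time of the check. It says nothing about whether the backup disc will still be readable in ten years, which is a separate worry.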

Long-term Accessibility

This is a bit of a thorny point with computers. Your paper shouldn’t go out of date in ten years. Your computer files might. If I store everything in Microsoft OneNote, will the data still be accessible from whatever computer I’m using ten years from now? Will Microsoft still make OneNote? If they don’t, will the copy I have now still work under Windows MegaSplendid 2015, Ultimate Wicked Edition? I have no idea, and I have no way of finding out. Microsoft don’t know either. If OneNote doesn’t sell, they’ll stop making it. If it sells well, they’ll keep updating it, and make sure it upgrades the data with each new version. My data access is being governed by market forces.

PDF files seem pretty safe right now, but will they still be in twenty years? Plain text is pretty much certain to still work, and I doubt HTML will become unreadable any time soon, but there are limits to what data we can put into these formats.

If I wasn’t so anal about it, I’d probably just dump everything into OneNote and stop worrying. But I am, so I do worry. Reducing everything to text adds work, though, which damages our Quick Storage, and makes it difficult to store pictures.
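
For snippets that arrive as HTML, at least, some of that extra work could be scripted. This is only a sketch using Python’s standard html.parser module; the list of block tags and the names TextOnly and html_to_text are assumptions of mine, and any pictures are simply lost, which is exactly the limitation above.

```python
from html.parser import HTMLParser

# Tags treated as line breaks when flattening to text (an assumed, partial list).
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}


class TextOnly(HTMLParser):
    """Collect visible text, skipping the contents of script and style tags."""

    def __init__(self):
        super().__init__()
        self.pieces = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.pieces.append("\n")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in BLOCK_TAGS:
            self.pieces.append("\n")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.pieces.append(data)


def html_to_text(html):
    """Reduce an HTML snippet to plain text, one line per block element."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.pieces).splitlines())
    return "\n".join(line for line in lines if line)
```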

Possible Solutions

OneNote

OneNote wouldn’t even be here, except that I use a tablet PC. OneNote goes well with tablets. At the touch of a button, a nice new window will pop up, sitting on top of whatever else you’re doing, for you to drag stuff to, or make a quick scribbled note with the tablet’s pen. That makes a big difference to your Quick Storage. You can search pretty quickly and easily too, though you can’t really limit it to keywords. The clever part is that you can scribble your notes in ink with the pen, and it can still search those notes. It reads your handwriting in the background, so it knows what the text says.

The problem, though, is the question of how long the data will last. If we’re looking for a solution that will last for ten or twenty years (or more), there’s no way of knowing if it will do the job. And if not, getting five years’ worth of data out of it all at once, five years down the line, could take some time. If there’s any chance of you switching platforms somewhere down the line (to Mac or Linux, say), OneNote is unlikely ever to work on them, though if the much-rumoured Mac tablet ever actually happens, Microsoft may be tempted to port OneNote to it.

Plain Text Files Only

The safest option for the computing paranoid. Can be easily moved from system to system – no problem if you move to Mac or Linux, or whatever else should come along. Should certainly be readable in ten or twenty years.

You’re limited in what you can store – no pictures, no ink notes on the tablet, which does have an effect on how quickly you can store stuff.

Limited File Types

You could choose a limited set of file types you allow yourself to store, and store your data there. To have a consistent way of keeping them dated and tagged with keywords, you could just use a defined format for the filenames – say “YYYY-MM-DD_Note Title_keyword keyword keyword_Source of Note.ext” or something similar.

You can make this as flexible as you like, or as reliable as you like, depending on how many file types you allow.
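
To make the filename idea concrete, here is a minimal sketch that builds and reads back names in that pattern. The allowed extensions, the function names, and the rule that fields can’t contain underscores (since underscores separate the fields) are all my own assumptions.

```python
from datetime import date
from pathlib import Path

ALLOWED_EXTENSIONS = {".txt", ".html", ".jpg"}  # hypothetical 'allowed' list


def build_name(day, title, keywords, source, ext):
    """Assemble 'YYYY-MM-DD_Title_kw kw_Source.ext' from its parts."""
    if ext.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError(f"{ext} is not an allowed file type")
    fields = [day.isoformat(), title, " ".join(keywords), source]
    if any("_" in field for field in fields):
        raise ValueError("fields must not contain underscores")
    return "_".join(fields) + ext.lower()


def parse_name(filename):
    """Split a filename built by build_name back into its fields."""
    path = Path(filename)
    day, title, keywords, source = path.stem.split("_")
    return {
        "date": date.fromisoformat(day),
        "title": title,
        "keywords": keywords.split(),
        "source": source,
        "ext": path.suffix,
    }
```

With names like these, a keyword search is just a matter of listing the folder and checking the keywords field of each parsed name, with no need to open the files at all.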

Plain Text

Certainly should be allowed – simple and reliable, and readable on anything.

HTML

The success of the web means that HTML is readable by almost any computer, and should be reliably readable for plenty of years to come.

JPG

Jpeg images should be as safe as HTML.

Windows Journal

Relevant to tablet users – comes with tablet editions of Windows, so is pretty sure to be in the next version of Windows at least, but could be dropped at any time. Can export to HTML or TIFF images, but this would have to be done for each file, one at a time. Unlikely to be readable in anything other than Windows unless it catches on a lot more than it has so far.

Microsoft Word

Word is probably a settled enough application that its files will still be readable for a long time to come. Readable on Mac or Linux using OpenOffice, if platform-independence is important to you.

Purpose-designed Database

Although it’s not something I’ve really looked into, there are databases specifically designed for storing all these snippets of information for writers. The problem with these is that you would have to be pretty sure that the publisher will still be updating it for as long as you need it to work. I don’t know any of them that well. It also doesn’t seem to be such a unique thing to need to do as to require a special program writing.

Conclusions

I’m interested to hear how other people have tackled this same problem, but my own temptation is to stick to a few file formats with carefully defined filenames to make the searching easier. As a tablet user, though, the temptation of ink is strong. I think maybe OneNote and Journal are great for transitory stuff – to-do lists, making quick notes, planning and writing articles (I’m writing this in OneNote now) – but less useful for any sort of long-term storage.

I think I’ll probably go for lots of files, with just certain file types ‘allowed’ (though it’s my rule, so I can change it) and using a filename that includes keywords to make searching quick and easy.

Any ideas?

External Related Articles

27 Responses to “Storing Nuggets of Information”

  1. Mike Brown says:

    A few things come to mind:

    Other writers: From his essays, I twigged that Martin Gardner kept drawers of index cards, meticulously cross-indexed, with relevant articles or snippets from his reading paper-clipped to them. He’d draw on these when writing his books/essays.

    The New Yorker magazine also had a legendary cross-indexed 3×5 index card catalog of the magazine’s contents going back to the founding. Their insurance company identified the index cards as a risk, which led them to move to a database, and then to scan in the issues, and then to release the magazine’s contents on DVD (I’m getting them for Christmas). The 3×5 card system has now been abandoned. (Read this in a NY Times article and an interview on NPR.)

    Journalist James Fallows (who worked with Msft on the development of OneNote, I think, esp from a journalist perspective) is a computer buff from way back. He touted the use of old DOS programs like Grandview (outliner program to help him organize his stories) and Lotus Agenda (“a spreadsheet for words,” which had pretty amazing natural-language processing of text on the fly – Google on that and breathe in the nostalgia). He used Agenda to collect snippets of everything, create categories and views on the fly, and essentially keep track of his research and notes.

    Nowadays, he uses Brainstorm and Mindmanager, and who knows what all.

    The novelist Robertson Davies kept a writer’s notebook of ideas, characters, etc (near to my heart as a writer). He numbered each page, and each entry on a page got a letter. When it came time to write a novel, he noted that entries 9F, 10A, and 12B related to a single character, and he drew the threads together that way.

    I’ve also had (and have) the info-packrat disease, which fueled my purchase of Agenda, Infoselect, Ecco Pro, and god knows how many others.

    The computer columnist Jim Seymour wrote somewhere, and it made an impression on me, that there is information that likes to be structured — by chronology, by someone’s name, by the alphabet, by location, by function, by program name, whatever — and then there is loose info that you can’t define a container for YET, but that you can’t bear to lose. This has caused me sleepless nights and I debate its core usefulness to me, often.

    The 43Folders post on living inside a single text file inspired me to try again at home with Notetab (Windows text editor). It has a simple structuring facility it calls an outline, but which is simply a flat list of topic headings on the left, and the text on the right. I’ve found I prefer the flat headings to hierarchical; they remind me of keeping notes in my Palm Memo (ie, “Books/Loaned to,” “Books/Library,” etc). It’s also like spreading everything out on a table so I can scan it quickly; nothing is hidden underneath another topic; everything is on the surface.

    Lately, I’m trying to bookmark less often, save info less often, UNLESS I have a specific project in mind. In that case, I create the folders/structures to contain that info and the info naturally adheres to it.

    At work, I use a dead-simple program called Electric Notebook (http://lincoln.midcoast.com/~ian/notebook.html), a very personal (ie, idiosyncratic) program with few of the amenities of OneNote, except that it can sit open all day, I type stuff in as it occurs to me, with (I hope) the right keywords, and then I search on it as I need to. Which is never as often as I think. It’s an electronic logbook, basically. It’s based on just keeping stuff chronologically, but in a rough-and-ready fashion. I find that it’s dumbed-down enough to suit my simple needs very nicely. I find, though, that I use it at home less than I use Notetab.

    For structured info at work, I use an OpenOffice Writer document to simulate Word’s Document Map function (which is similar to Notetab’s outline function — is there a pattern??). This document is called “infoindex” and holds various Unix commands, checklists, timecard chargecodes, etc., that demand to be stored and used as reference, not stuff that’s part of the passing scene. Stuff I input into Notebook that’s worth remembering or referring back to more than once gets migrated to the infoindex.

    I find this two-pronged approach works well for me. Electric Notebook for unstructured info, Infoindex for structured info. And it’s a simple enough process that I can use it when I’m distracted or under the weather.

    I would also refer you to the c2.com wiki’s entries on LogBook (http://c2.com/cgi/wiki?LogBook) and ElectronicLogBook (http://c2.com/cgi/wiki?ElectronicLogBook).

    Sorry for the long post! But this is a big interest of mine.

    Mike

  2. Bill says:

    I’ve struggled with this for years. One of the first applications I bought for my Macintosh in 1985 was FactFinder. It was a slick little database for storing and retrieving text notes.

    What served me well from 1988 to 2004 was a small, spiral-bound notebook made by DayTimer. The model number is 98160. You can see them here

    This little book had several distinct advantages.

    1. Spiral bind lays flat.

    2. Pages are numbered (68 pages per book).

    3. Small enough (3.5 x 6.5 inches) to fit in a pocket.

    4. Fit beautifully in my Filofax.

    I would write the date at the top of each page and then write all of my notes for that day. Some days ran to several pages. Everything went in there: phone numbers, notes from phone calls, ideas, quotes from articles or speeches or conversations. I usually filled up 2 or 3 books each year. One year I filled 5.

    I think I got the idea from one of Jerry Pournelle’s BYTE columns.

    http://www.byte.com/art/9601/sec13/art1.htm

    For a while I kept a simple database of the books’ contents. Wished I’d kept it up.

    (Edited by pigpogm – just removing a long URL that was breaking the page flow – replaced it with a link.)

  3. pigpogm says:

    Thanks for all the comments…

    Jim: XML is a step in the right direction, but I don’t know how much it helps practically – the application still has to be able to understand the specific markup used in the XML file, doesn’t it? Certainly better than anything binary, though – at least the text should all be readable and understandable, and in some way parseable.

    Rob: Google desktop certainly can help with the searching, but doesn’t help with file formats becoming unsupported over the years. I ended up uninstalling it, as it seemed to be taking too many resources, but the searching was quite impressive.

    mc: Mail.app stores everything as text? Nice. I assume it uses one of the standard old Unix mail formats, then, which should remain fully supported for a long time, and at least searchable and readable as text pretty much indefinitely. Sounds like quite a nice setup you have there – the Newton still has many fans.

    Steve: I only say HTML slightly less than text, as the markup does change a bit over the years, but you’re probably right – old HTML 2 files should still work just fine in any modern browser, and they’re still just as searchable and readable in a text editor. I thought current versions of Word would still open pretty much any previous version of Word’s files, along with WordPerfect 5.1, WordStar, and the like, but I have to admit, I’ve not tried it. Reopening and saving in new versions would be a lot of work each time, but would keep things up to date. I’d have thought JPEG should remain pretty much safe, since the web is so heavily dependent on it – can’t see JPEGs becoming unsupported in the next ten or twenty years. Not too easy to search for text, though – next version of OneNote will do it. You make a good point about CDs and DVDs, though. I’d probably keep everything on hard disk, though, with CDs and DVDs for backups. We’re probably not talking about that much data, but if we’re including such things as digital camera snaps, it could certainly add up to a lot.

    All: Sorry for the formatting problems in the comments – it seems to have been eating line breaks since upgrading WordPress – grabbed a new version of the MarkDown plug-in, and that seems to have fixed things. Thanks again for the input.

  4. steve says:

    “Limited File Formats” a slight quibble.

    HTML should be as good as text, provided you keep the markup extremely simple. It’s only text after all.

    I personally prefer HTML over text as it is structured but simple. And as text, it is searchable by command-level tools.

    jpeg? not nearly as durable as HTML since HTML is as good as text.

    MS Word? If you look back over the history of Word, you will find Microsoft changes the file format of .doc almost every version of Word. And backwards compatibility is lost every 2, 3 versions. This is extremely unlikely to endure.

    The only truly ‘safe’ way to keep documents readable, is to maintain your entire archive in current formats. Using MS Word as an example, then, you would upgrade Word/Office with each new version (at least every other version) and re-open all your old Word documents, saving them under the new format, and re-archiving.

    Ditto with jpegs and any other binary file formats.

    This approach has the added benefit of keeping your store of hard-drives, CDs or DVDs, relatively ‘fresh’ – though it does add a lot of effort to the whole process. For that reason, I’d say a yearly ‘archive refresh’ would be smart policy. Probably shouldn’t take more than a weekend?

  5. mc says:

    Thanks for the article. I have the original tablet, a Newton 2100 which compares most excellently with the travesty of an Acer tablet that my wife’s work gave her. (Sorry, don’t mean to start a war on that account, so ignore it, please.) The Newton files I have can easily be sent by mail, where they are stored in my Mail.app folders – text files, standards readable, etc. I do the “yyyymmdd title keyword” naming system, in addition. When not writing on the Newton, I use plain text where possible. All this gets merged together into searchable files, and it works great. Plain text archives, readable in any format, linked together with OS level search and archiving. Works for me…

  6. rob says:

    You could store it using any application then use Google desktop to find it.

  7. The Commonplace Book

    Over at the PigPog Blog is a great post about Storing Nuggets of Information, calling for ideas. This is something I’ve been struggling with for many years myself, and have only lately been making any sort of headway. When I think about all the years …

  8. Jim says:

    Good problem statement. One of the best tools I’ve come across for this is an app called Tinderbox. Unfortunately, it’s only available for the Mac, although a Win version has been promised – for over two years. Data is in XML format, which, while not a guarantee of future accessibility, at least increases the probability. Something to consider in whatever solution you select.

  9. Mike Brown says:

    Oh, and another c2.com link to Programmers Notebook, which includes a list of best practices: http://c2.com/cgi/wiki?ProgrammersNotebook

    meb

  10. pigpogm says:

    Wow, thanks, Mike. That’s a lot of information. It’s fascinating to know some of the history behind some of this stuff – I assume you (and most other people) have come from Douglas’ article on DIYPlanner, where he discusses Commonplace Books?

    Not seen Brainstorm. Mindmanager is a program that you can’t seem to help being repeatedly told about when you have a tablet PC, but the price is just way too much for me to consider. I’ve never tried Agenda, but it’s certainly a legendary app – Mitch Kapor’s work, wasn’t it, in the early days of Lotus? Vaguely heard of Infoselect, but don’t know much about it. Ecco Pro still has its fans, and it’s distributed free now, which always makes things more popular.

    The 43 Folders post you mention was an inspiring one. It did tempt me for a while, but plain text doesn’t really go well with a tablet. Admittedly, I was thinking then about lists of current information, which has very different requirements from reference material stored away in case it’s needed years from now. GTD lists and the like don’t need to be in a format that’s readable in ten years, as long as they’re readable for ten days or weeks.

    Thanks again for all the info. I’ll probably do a follow-up article soon to this, taking in some of what I’ve learned in the comments and from Doug.

  11. Mike Brown says:

    Hi Michael — Yes, I got to this topic from DIY, though your site has been on my radar for quite a while and I check in occasionally. I tried PigPog for a while, didn’t click with me, but it helped me think through how I personally process info and tasks, and I re-read the post from time to time.

    I’ve always been a fan of commonplace books, don’t know why. I keep a Word file that I dump them into, and at end of year I print it out and put it in a “Commonplace” folder; the folder also holds hard copy I come across that I want to preserve.

    See, information packrat :)

    I bought Brainstorm and have tried it a couple of times, but it also doesn’t click with me. I’ll probably try it again; I like trying out idiosyncratic programs made by developers at home. Notebox Disorganizer is another oddity; the UI is basically a spreadsheet grid, but each cell is a cubbyhole in which you can dump your information. The Editorium newsletter had a neat description of how he uses it; scroll down to “Resources.”

    Mindmaps are more fun to hand draw and noodle with, IMO, than the software-based ones. Too much cognitive overhead and time spent getting it just right on the puter, when a good-enough handdrawn one will help sort out your thinking.

    There’s also Evernote, if you’ve not tried it. It’s been getting some good buzz.

    Yes, Agenda was Kapor’s brainchild, and he’s now working on something called Chandler, supposedly another info mgmt tool. Agenda still has quite a loyal following.

    So much software, so little value from so much of it. I wonder if, in a world of less software meant to save time and improve my life, I’d have read more books.

    I think software is sometimes best used for a specific project or purpose, not as something to live in. That’s why I like the idea of the single text file approach — Google has taught us that categorization is not vital if you have full text search. And there’s little in my personal life that requires the full categorization that I need in my workplace.

    Still, I’m one of those people who like to file and make categories, so it comes naturally to me. I remember something I read a long time ago, that humans (esp computer people) tend to leap for the complicated solution first, thinking of all the exceptions that have to be trapped, and so on. In reality, a good-enough system will probably work and you only should handle exceptions as they arise.

    This is why I’ve drifted away from all-in-one software solutions, because I find I tend not to think of them as easy to use as a pencil or a text editor. (I daresay PigPog is an attempt to simplify GTD in the same way.) I also think that’s the value of the weekly review, to refresh those brain synapses about what’s out there. You can’t remember everything, but if you can remember where you put it, then that’s just as good. As the Extreme Programming guys say, do the simplest thing that could possibly work.

    You probably read/heard about the researcher who used DevonThink as his commonplace book/dumping ground for bits of text. He had an assistant type in lots of stuff and then Devon searched around and made unusual connections the writer would not have thought of. But the time cost of doing something like that is prohibitive to me. And as you say, what if the software never progresses (like Agenda or Ecco)?

    Sorry for another long post! I find this kind of discussion hits on things I’ve tried to figure out in my own life/work.

    All best — mike

  12. Gerard says:

    I have tried dozens of programs for info-keeping on the PC and settled on TexNotes from GemX. However, my “real” commonplace book is a Moleskine notebook. Yep….pen and paper….hard to beat.

  13. The Commonplace Book: Part I

    Over at the PigPog Blog is a great post about Storing Nuggets of Information, calling for ideas. This is something I’ve been struggling with for many years myself, and have only lately been making any sort of headway. When I think about all the years …

  14. Karl Vogel says:

    I’ve tried lots of ways to organize the stuff I store (mostly work-related). My current setup seems to solve more problems than it causes; if you’re interested, have a look at http://www.dnaco.net/~vogelke/Software/Time_Management/Work_Environment/

  15. Hal says:

    Freemind (http://sourceforge.net/projects/freemind/) is an OSS alternative to MindManager.

    Hal

  16. cainmark says:

    I use topic tagging, and the same tagging for my folder system for storage, both virtual and physical. What I finally settled on was the Library of Congress system, but smushed and shortened for my convenience.

    Example:

    Z-Lib

    Z-Lib_Cataloging

    Z-Lib_Typography_Fonts

    Z-Lib_Work

    The fonts category has a symbolic link to it from N-ART_NE_Print, since that would be the other place to find that kind of information in an academic library like the one I work in.

    Another thing that’s helped is having an AA Media and AA Programs folder/tag, that contains all the other tags. That way my media and programs are always separate, and in my main organization scheme it takes less than three tries to find the information I stored.

    You can see an example of it here at my LibraryThing catalog. http://www.librarything.com/work/2418640&book=216333

    While I found it’s great for books, it’s also great for storing nuggets of information, as long as I remember to tag it with every tag I’ve created that I think is relevant.

    It’s been working wonderfully for me. It’s text based and folder based, so should work on any computer system. I use Ubuntu Debian GNU/Linux, which has a great finder program called “Beagle”, but I found that didn’t help me with my physical notes and files. The system I devised has been working great for me.

  17. Anonymous says:

    Massive thread on “General brainstorming for note-taking software” here: http://www.donationcoder.com/Forums/bb/index.php?PHPSESSID=688320de372d314004fa2956a741e3a4&topic=2362.0

  18. Chris Bidmead says:

    “I remember him mentioning at the time that he’d kept the years as one digit, because the system was only designed with a ten-year lifespan in mind. Not sure what he did when he hit that limit.”

    I recall this too, but slightly differently. It was the month, not the year, that I reduced to a single (hex) digit so that I could keep the filename prefix down to 8 chars, allowing me three leading chars for descriptive purposes, and reserving the three char suffix for defining filetype or some other parameter like magazine. Eg: SEX11A89.PCW. As you say, all these files were .txt files, at least initially, although later I mixed in .rtf files as well. Any “time limit” on this scheme would have been because I foresaw the eventual removal of the damnfool restriction to 8.3 chars per filename imposed by MSDOS.

    Incidentally the first line header took this form:

    [Title of the piece][magazine][date][author]

    … which allowed these fields to be parsed out by a variety of different text processing utils. This part of the scheme is something I still use today.

    – Chris

  19. pigpogm says:

    Thanks for the clarification, Chris. I do remember the first line being special information, which was pulled out and processed by a bit of Unix scripting, and I think the filename limit was just so you could move the whole thing to any other system without having to chop filenames down, so you limited it to DOS/ISO9660. The month was certainly hex, because I picked up the habit for a while of using the same thing, which annoyed anyone attempting to read those dates ;)

    Shame I don’t still have the back issues to be able to check, but after keeping a couple of years of PCWs, the house foundations were starting to crumble, so we had to hire a low-loader and a crane, and have them removed. They don’t make magazines that thick any more.

    I vaguely remember, though, some mention of the limit being relatively short, but that you were confident that technology would catch up a bit by then.

    Anyway, ‘SEX’ and ‘PCW’? Did I miss an issue? Actually, you could never tell with Michael Hewitt’s column.

    Thanks for those great columns, though – it says a lot that they are still inspiring discussion all this time later. Now it feels like we’re having the world’s slowest conversation.

  20. [...] are comments I left on the high-fun personal blog PigPog. Back in 2005, Michael wrote a post on storing and retrieving nuggets of information. This invited a couple of unedited brain-dumps from your Humble [...]

  21. “To have a consistent way of keeping them dated and tagged with keywords, you could just use a defined format for the filenames – say ‘YYYY-MM-DD_Note Title_keyword keyword keyword_Source of Note.ext’”

    I was just wondering if there’s a special reason for this file name configuration. Could you explain your reasons in more detail, and say whether you are currently using this or any other schema? Regards.
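
One practical reason for a fixed filename format is that a short script can take it apart again. The sketch below assumes underscores appear only as field separators, never inside a title or source; the function name and the sample filename are made up for illustration.

```python
from datetime import date
from pathlib import PurePath


def parse_note_filename(filename):
    """Split 'YYYY-MM-DD_Note Title_keyword keyword_Source of Note.ext' into fields.

    Assumes exactly four underscore-separated fields before the extension.
    """
    path = PurePath(filename)
    parts = path.stem.split("_")
    if len(parts) != 4:
        raise ValueError(f"expected 4 underscore-separated fields: {filename!r}")
    day, title, keywords, source = parts
    return {
        "date": date.fromisoformat(day),  # raises ValueError on a malformed date
        "title": title,
        "keywords": keywords.split(),     # keywords are separated by spaces
        "source": source,
        "ext": path.suffix.lstrip("."),
    }


if __name__ == "__main__":
    # Hypothetical note, named after the article's egg example.
    print(parse_note_filename("2005-12-06_Egg Facts_eggs cooking_Some Book.txt"))
```

Putting the date first in YYYY-MM-DD form also means a plain alphabetical sort of the folder is a chronological sort.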

  22. Kurt says:

    Ok – so this really got me thinking about long term storage and retrieval. The retrieval part I have down. But the long term part is something I had not considered as deeply as I should have until I read this post. I have tons of valuable information that I spent a long time accumulating for a book and hopefully multiple books. I use both OneNote and askSam, but I don’t want to be locked into a particular software or platform or file format, for the obvious reasons. So the issue is this. If I move to a different platform, or if Microsoft changes the OneNote file format, etc. etc., I don’t want to skip a beat. I want to have all this information that I spent a long time accumulating at my fingertips.

    I think that there is a solution, but before we get into that, let’s discuss topic tags. In my opinion topic tagging is a “must do” for pin-point information retrieval. But you have to do it the right way.

    Unique Topic Tagging:

    Topic tagging goes like this. If you have a chunk or “nugget” of information – a sentence, quote, paragraph, whatever – that you want to save for later retrieval, you topic tag it. Let’s say you are reading a web page and you want to save a chunk of information that falls under the topic “biology”. First you import it into OneNote (or askSam or whatever). Then, as you read the web page, you topic tag it at the important “nuggets”. And here’s the key. You make it a unique topic tag, so that when you search you come up with the topic tag rather than the topic word. For example, if my topic is “biology” and I search for the word “biology”, I would have to wade through piles of information for the nugget I was looking for. Total waste of time. But if you use a unique topic tag you go right to it. For example, the “biology” topic may be tagged as bio> or something like that. Just as long as it’s not a real word, but is similar to the topic title so you can remember it. Then when you search, it will come up with each reference tagged as bio>. You can also break it down into subcategories such as bio>gen>, which is short for biology – genes, and so on. Thus, a week, a month, a decade or 50 years from now, when you want to write an essay or a book on biology and specifically genes, you can search for bio>gen> and you will have the tagged information. (Oh yeah. Keep a list of your unique topic tags and subtags with you at all times in your BlackBerry, Palm or whatever.)

    Long Term Storage:

    Now. This gets back to the original issue. Let’s say in 2 years I decide to jump the Microsoft ship and go Linux or Mac. All that information in OneNote becomes useless. Well, maybe not. It would take some work – probably a day or two if you have lots of information – but you can do it. You can “send” the OneNote information in a given section to a Word file (Control+Shift and highlight the tabs you want to copy; then File, Send to Word file). Then save the Word file and open it in OpenOffice. Then I can save that file in ODF format, which may or may not be a common standard by then. You still keep most of the pictures etc.

    The problem with this approach is that with all the pictures the file would get huge and unwieldy fast. So you would have to save it into separate files, perhaps under topics or groups of topics. You have your tags, so you should be in good shape doing a Beagle search (or a Spotlight search on the Mac).

    Alternatively (and this is probably the way to go if you don’t need all the purrrdy pictures), as was mentioned earlier, simply save the notes as text and paste the whole mash into your favorite text editor. You will still have what’s vital: the information and the topic tags. Plus, most text editors can handle huge files and search very fast.

    So the bottom line is this. Go ahead. Use OneNote or whatever you use. Toss it all in there. Just make sure you have a way to get it into a text file. (askSam can export directly to a text file.) If you move to a new system it may take some time to move it all over, but it can be done and, most important of all, you still have your unique topic tags, clutching onto those nuggets of information.

    Kurt
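
To see why a made-up tag such as bio> finds things that the plain word would bury, here is a minimal sketch of the search over a plain text export. The function name and the sample notes are invented for illustration; any editor's search box does the same job.

```python
def find_tagged(lines, tag):
    """Return (line_number, text) for every line containing the literal tag."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if tag in line
    ]


if __name__ == "__main__":
    # Made-up sample notes; in practice these would be the lines of the text file.
    notes = [
        "bio> Cells are the basic unit of life.",
        "bio>gen> Genes are made of DNA.",
        "I read a biology book last week.",
        "phil> A note about something else entirely.",
    ]
    print(find_tagged(notes, "bio>"))      # the topic, including its subtopics
    print(find_tagged(notes, "bio>gen>"))  # just the genes subtopic
    print(find_tagged(notes, "biology"))   # the real word only hits ordinary prose
```

Because bio>gen> starts with bio>, a search for the parent tag also picks up every subtopic, which is usually what you want when starting broad.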

  23. Mike Shea says:

    This topic has interested me a lot. I recently built a website called the Memory Hole (www.memhole.com) that can capture bookmarks, web pages, and notes into a single huge Atom file. Atom is a common XML format for information syndication, and it has all the right fields for storing random notes and nonsense. The site uses bookmarklets to easily save information into a personal database that can be searched, subscribed to by others, or downloaded locally.
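
For a rough idea of what a note looks like as an Atom entry, here is a minimal sketch using Python's standard library. It is not the Memory Hole's actual output, which isn't shown here; the function name, the id and the sample note are all made up. It fills in only the id, title and updated elements that Atom requires of every entry, plus plain text content.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM_NS = "http://www.w3.org/2005/Atom"


def note_to_entry(note_id, title, text, updated):
    """Build a bare-bones Atom <entry> element for one note."""
    ET.register_namespace("", ATOM_NS)  # serialise Atom as the default namespace
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = note_id
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = updated.isoformat()
    content = ET.SubElement(entry, f"{{{ATOM_NS}}}content", {"type": "text"})
    content.text = text
    return entry


if __name__ == "__main__":
    # Hypothetical note and id, purely for illustration.
    entry = note_to_entry(
        "urn:example:note-1",
        "Egg note",
        "A nugget worth keeping for the egg book.",
        datetime(2005, 12, 6, 12, 46, tzinfo=timezone.utc),
    )
    print(ET.tostring(entry, encoding="unicode"))
```

A complete feed would wrap entries like this in a feed element carrying its own id, title, updated and author.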

  24. pigpogm says:

    The about page of your site explains the idea pretty neatly, and it does sound good.

    I have to admit, personally, I’d have a problem with using a web app for something like this, in the same way that I like my email and PIM to be local, not web-based. Definitely looks like my kind is going the way of the dinosaurs, though ;) And since you’re making everything available as a single Atom file, there’s no problem with exporting and keeping your own backups if you don’t want to trust the site owner (I hear he’s very trustworthy and reliable, though).

    If you want the Web 2.0 cred, though, you’ll have to add the AJAX you mention, and misspell the name. Maybe memry.com? (Damn, it’s taken.)

  25. Kurt says:

    Ok. It’s been several months since I first posted about long term storage and retrieval of “nuggets”. I have made some adjustments to my program and it works just great for me.

    Rather than using OneNote or askSam, I simply cut and paste or type the information I need into a text editor and unique topic tag it. For example, a tag for philosophy is phil>. If I have an idea, I tag it as bin> (big idea note). If it’s a book I’m taking notes on, it gets a bib> (bibliography) tag. All of it goes into one big ass text file which I can search instantaneously (about 30 meg so far).

    The down side is that I lose the pictures with web pages – but this is OK, since I can save the web page as HTML in a separate file and use either Copernic or MSN desktop search if I need to retrieve it (again using unique topic tags in the name of the file – in this case I use zz instead of > because Windows will not let > be in a name. Thus philzz stands for philosophy).

    My current text editor is VIM with the Cream overlay, but I found that UltraEdit worked quite well also. The key with the text editor is searching and the ability to handle very large files. It also means that if I move to Linux in the not too distant future, I can use VIM and the same text file, since VIM operates on both Windows and Linux.

    One of the real nice things with this set-up is that I can open a couple of windows in VIM. My database in one window and my writing project in the other…. Imagine having years of research at the tip of your fingers, in a text file that will not be obsolete when the software changes and that I can search in seconds.

    For me this solves the problem of information storage and retrieval quite well.

    Kurt
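
Kurt's zz substitution for filenames is easy to mirror in a small helper. This is just a sketch with made-up function names and filenames; it stands in for what a desktop search tool would do with the file names.

```python
def tag_to_filename_token(tag):
    """Turn a topic tag such as 'phil>' into its filename-safe form 'philzz'."""
    return tag.replace(">", "zz")


def files_for_tag(filenames, tag):
    """Return the filenames that contain the filename-safe form of the tag."""
    token = tag_to_filename_token(tag).lower()
    return [name for name in filenames if token in name.lower()]


if __name__ == "__main__":
    # Hypothetical saved web pages.
    saved_pages = ["philzz stoics.html", "biozz cells.html", "holiday photos.html"]
    print(tag_to_filename_token("phil>"))
    print(files_for_tag(saved_pages, "phil>"))
```

One limitation: a topic name that itself contains "zz" would be ambiguous in a filename, so tags are best chosen to avoid it.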

  26. Hal says:

    Being on the road a lot I’ve struggled with this as well. I’ve recently designed a “micro-hipster” (being some preprinted business cards and a clip) and use my cell phone camera to capture an image of the card before I lose/toss it. To me the difficulty has always been moving from paper to digital. I HATE to re-key information.

  27. pigpogm says:

    I know what you mean, Hal. I never quite feel like something’s real until it’s ‘virtual’. A note on paper just doesn’t feel real – once it’s on the computer, though, I can actually do something with it.

    It probably comes down to where most of your ‘stuff’ comes from and goes to – most of my ‘stuff’ comes from the web, RSS feeds, emails, etc., and most of it goes to the web. Paper just doesn’t fit in well with that. The only thing I tend to do with paper at the moment is very brief notes, and brainstorming/thinking tasks.
