My Text Corpus in 2017

I’ve long used nvALT and Dropbox to maintain my collection of notes on my Mac. The benefit of this system is that so many iOS applications sync with Dropbox. I can search and edit my large collection of notes almost anywhere that I’m sitting. But 2016 brought some new innovations and concerns. The tremendous improvements with DEVONthink on iOS and the secure end to end encryption with their Mac application have made it a compelling capture and reference manager.

Along with my “concerns” about Dropbox, I’ve been disappointed with where their attention went in 2016. I worry about their commitment to their core competency, which is document syncing. I’m also concerned about the ever-expanding interest in, and power to access, personal documents online. I really don’t feel comfortable putting all of my plain text documents in Dropbox unencrypted.

I moved my entire collection of notes to DEVONthink Pro Office and DEVONthink To Go for iOS. I’m still reliant on Dropbox (and CloudMe) for syncing but the content is encrypted before it makes it to their servers. If Dropbox dies a painful death, I’m hopeful that DEVONtech would quickly move syncing to another provider like Apple or Box.com.

While moving my files, I also took the opportunity to think through my primary needs and do a little house cleaning along the way.

Here are my top level requirements for taking notes:

  1. A search should find exactly what I need within a few seconds. It doesn’t need to return the single note I need but it should have enough intelligence to prioritize better matches and get me to what I want quickly.
  2. Creating a new note should be dead simple and fast, fast, fast.
  3. The document should be stored as portable plain text whenever possible. HTML does not count as plain text, even though it is technically just text. Don’t be pedantic.
  4. I should be able to store more than just plain text if I want to.

Those 4 criteria are pretty easy to satisfy with DEVONthink for Mac and iOS. It’s not all puppies and ice cream though. Here’s what I lose:

  1. I can’t use a bunch of different text editors to search and edit on iOS or Mac. I could use referenced files on the Mac but I don’t want to mess with maintaining referenced and database documents.
  2. Search on iOS doesn’t highlight the matches when I search like my beloved Editorial app does.
  3. I can’t use the Dropbox recovery features if I blow up my collection of notes. I suspect the fact that I don’t have multiple apps syncing and editing my notes will mean I have a reduced need for disaster recovery.
  4. Apps like Editorial and nvALT are just so much faster for creating new notes.

I can still create notes very quickly on iOS using the Drafts app. On the Mac, the DEVONthink “Sorter” sits on the side of my screen so I can quickly grab a note anytime.

Mac Scratch

I think there’s more payoff in DEVONthink that offsets the pain points. For example, DEVONthink showing me potentially related notes is a massive bonus.

related notes

DEVONthink also quickly figures out related keywords for me. Just selecting a keyword from the list shows me more files in my collection that might also be of interest.

Keywords

So, yes, there are trade-offs. There always are. Right now the loss of infinite mobility between apps is not as important to me as excellent search, tagging, and encryption.

Methodology

I take no credit for any part of this system. Most of these ideas came either from Merlin Mann or by way of him. He’s very smart and I don’t think I could thank him enough for sharing his ideas publicly. He’s a heck of a gentleman.

The other parts of my system are from trial and error. It’s not brain surgery.

Tags

Most tags are easily forgotten. I stick with just a small collection. I started out putting my tags in the file name because most applications didn’t search file content. Then I moved them into the file body because applications started to improve. Now I tag the files using native tags in DEVONthink.

The q Trick

This is one of Merlin’s big ideas that really resonated with me. By using a sequence of uncommon letters, I create an escalating priority for notes that works with search in almost any app. A file that has “qq” in its name is probably pretty important. A critical or commonly used file is tagged with “qqq” and so on. The “q” key doesn’t require changing to a secondary keyboard on iOS and it’s fast to type in a search box. The more q’s the more important the file. It works surprisingly well.
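As a rough illustration of why this works with plain search, here is a minimal Python sketch that ranks file names by their longest run of q’s. It assumes a run of two or more counts as a priority marker (so an ordinary word like “quick” is ignored), and the file names are made up for the example.

```python
import re


def q_priority(filename: str) -> int:
    """Return the length of the longest run of two or more q's in a file name."""
    runs = re.findall(r"q{2,}", filename.lower())
    return max((len(run) for run in runs), default=0)


def sort_by_priority(filenames):
    """Most q's first; ties fall back to alphabetical order."""
    return sorted(filenames, key=lambda name: (-q_priority(name), name.lower()))


# Hypothetical file names, only for illustration.
notes = ["wifi passwords qqq.txt", "grocery list.txt", "project charter qq.txt"]
ranked = sort_by_priority(notes)
```

In practice no script is needed: typing “qq” or “qqq” into a search box does the same filtering, which is the whole appeal.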

Differentiating Tags from Text

Maybe it’s obvious, but using generic words as tags didn’t work well at all. Performing a global search for “note” is not that helpful when I just want to find documents tagged as a “note.” I originally adopted another Merlin trick here and appended an “x” at the end of tag words, so “note” became “notex.” This worked moderately well, but I’ll be honest, it was hard to read. Too many words looked like other words. Also, bad OCR tends to create words with either an “X” or “Q” character where it shouldn’t be, which made for some confusing search results.

Then I used an “@” prefix on tag words. I used this methodology for many years. A search in nvALT differentiates “@note” from the word “note” very well. The Finder does this very well too.

Tag Search in Finder

Unfortunately, not every app works well with oddball character prefixes because they are treated as search operators. For example, DEVONthink ignores that “@” character altogether. I hate changing my methodology just to use one or two apps, but it’s either adapt or never use anything better than what I have now.

“Turns Out” that doubling the first letter of a tag word works well for me. The words are still very readable. They are differentiated from standard words in the document text. My “@politics” tag became “ppolitics” and my “@science” tag became “sscience.” Life went on.
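For tags that live in the body of a text file, the switch from “@” prefixes to doubled first letters can be scripted. This is a minimal Python sketch, not the process actually used for this collection. It assumes tags look like “@word” and that an “@” preceded by a letter or digit (as in an email address) should be left alone.

```python
import re

# Match "@word" only when the "@" is not glued to a preceding word character,
# so addresses like me@example.com are skipped. This pattern is an assumption.
AT_TAG = re.compile(r"(?<![\w@])@([A-Za-z]\w*)")


def double_first_letter(match: re.Match) -> str:
    """Turn a matched '@politics' into 'ppolitics'."""
    word = match.group(1).lower()
    return word[0] + word


def migrate_at_tags(text: str) -> str:
    """Rewrite every @-prefixed tag in the text to the doubled-letter form."""
    return AT_TAG.sub(double_first_letter, text)
```

Running `migrate_at_tags("@politics @science")` gives back “ppolitics sscience”, matching the examples above.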

Auto-complete of tags still works pretty well in DEVONthink for all file types. This is pretty key for being consistent with tags. It avoids misspellings. It also really makes me think before I create an entirely new tag. If the tag doesn’t already exist after this long, I may not need it.

Tag suggestion

I try to use the same tags for text files as I do for web clippings. General tags should be generally applicable.

Tags in DTTG

Luckily I don’t have a tag for Llamas yet because “lllama” looks a little weird.

I still have a mess of tags on services like Pinboard because that’s a lot of legacy and there are few good tools for updating existing tags. I think I prefer the mess to the time it would take to create consistency.

Tag Location

There’s a risk with using tags on the files instead of in the files. If DEVONthink and macOS both stop supporting tags then I’m up a creek without a paddle. All of my tags will be lost. I think if that day ever comes, I can convert tags to either file comments or add them as in-line text inside the file. For now I like using the tag field because I get extra benefits for filtering and search. Additionally, adding, deleting, and changing tags does not change the file modification date, so it doesn’t screw up result sorting.
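The in-line fallback could be as simple as appending one line of tags to the end of each note. Here is a minimal Python sketch of that idea. The “Tags:” label and the function name are hypothetical, and how the tags would be exported from DEVONthink in the first place is left out.

```python
def append_tag_line(body: str, tags) -> str:
    """Return the note text with a trailing 'Tags: ...' line (hypothetical format)."""
    if not tags:
        return body
    tag_line = "Tags: " + " ".join(sorted(tags))
    # Normalize trailing newlines so the tag line always sits in its own paragraph.
    return body.rstrip("\n") + "\n\n" + tag_line + "\n"
```

Because the tag words are already unique (“rref”, “nnote”), a global content search would still find them after the move.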

Tag Ontology

A sufficiently complex system of tags is indistinguishable from a pile of crap.1 My tag nomenclature is pretty limited. I try to only choose tags that can be combined to quickly narrow results. For example, “ccode” usually means a note contains some snippet of code while “ppython” means it’s in Python. Searching just “ppython” would return notes that might be about the language while searching just “ccode” would return JavaScript, Ruby, and a variety of other languages I’m also terrible at.

I mostly care about tags that combine with only one other word. I’m not trying to write a sentence. So “rrecipe” pairs nicely with “vvegetarian” or with “ssous_vide.” All three words can go together but I’m very unlikely to need to search for all three before I find the note I want. The main point is that I don’t over-think the tag combinations. I try to stick to simple high level concepts that combine well. A search for “vvegetarian” and “sscience” produces a very different result than “vvegetarian” and “ssous_vide”, as different as the results for “ssous_vide” and “sscience.”
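Combining tags is just set intersection. The following Python sketch shows the narrowing effect on a tiny, made-up index. The note names are hypothetical and this is not how DEVONthink implements search.

```python
def find_notes(index, *tags):
    """Return the names of notes that carry every one of the given tags."""
    wanted = set(tags)
    return sorted(name for name, note_tags in index.items() if wanted <= note_tags)


# Hypothetical notes and their tags, only for illustration.
index = {
    "lentil curry.txt": {"rrecipe", "vvegetarian"},
    "short ribs.txt": {"rrecipe", "ssous_vide"},
    "maillard reaction.txt": {"nnote", "sscience"},
}

all_recipes = find_notes(index, "rrecipe")
veggie_recipes = find_notes(index, "rrecipe", "vvegetarian")
```

One tag returns both recipes; adding a second tag cuts the list to the single vegetarian one, which is usually all the narrowing I need.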

I avoid spaces, hyphens, and general punctuation in tags. Instead, I use an underscore character. Hyphens in tag words (such as sous-vide) are replaced with an underscore. Spaces between words are replaced with underscores. Instead of a single tag of “business card” I use “bbusiness_card” which is easily recognizable, doesn’t confuse search tools, and indicates that this is a single tag and not two different tags.

I use all lower case because mixed case can be treated differently and if I didn’t pick one standard I’d go crazy trying to remember what I should do.

I omit the periods in acronyms and initialisms. This is where my system is a bit ugly but functional. Instead of “AI” I use the tag “aai.” It’s not nearly as readable but it works great for search.
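Taken together, the naming rules (lower case, underscores for spaces and hyphens, no periods, doubled first letter) are mechanical enough to write down. This is a minimal Python sketch of those rules. The function name is hypothetical, and dropping any other punctuation is an assumption on top of what is described here.

```python
import re


def normalize_tag(label: str) -> str:
    """Apply the tag rules: lower case, no periods, underscores, doubled first letter."""
    cleaned = label.strip().lower().replace(".", "")
    # Spaces and hyphens both become a single underscore.
    cleaned = re.sub(r"[\s\-]+", "_", cleaned)
    # Assumption: any remaining punctuation is simply dropped.
    cleaned = re.sub(r"[^a-z0-9_]", "", cleaned)
    if not cleaned:
        raise ValueError("tag label has no usable characters")
    return cleaned[0] + cleaned
```

With this, `normalize_tag("sous-vide")` gives “ssous_vide” and `normalize_tag("A.I.")` gives “aai”.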

rref

This is a generic tag for indicating that a note is really a reference document. This means it contains some sort of factual information, such as my wife’s shoe size or a list of the names of my neighbors and their drink preferences.

nnote

A file with the “nnote” tag is usually some sort of meeting note. It may or may not contain a list of facts. If I take notes during a phone call or a conference then I tag the file with nnote. If I’m learning about a new topic and I write up a summary of the material I tag it with nnote. I have a lot of nnote documents.

llist

Files tagged with “llist” are, well, lists. This includes things like books I want to read or the bourbons I want to try. They are mostly done in the TaskPaper format.

aaction

The “aaction” tag is the freak of my system. It’s the only tag in my DEVONthink that has children. This tag means I’m supposed to do something with the document:

  • ttranscribe (transcribe this photo, scan, or audio file to a text document)
  • ttask (create a task for this note)
  • eemail (draft emails)

This covers almost everything I’d need to do with a file other than reference it. It also allows me to jump into DEVONthink, create a quick text note with information and then set a tag of “ttask” to return to later when I have more time. It’s not a substitute for my OmniFocus Inbox. These notes are generally full of information like server connection details or the names of people on a project team. The task would be something like creating a 1Password entry for the server or adding the names and contact details to a project charter file in my note collection.

The nice thing about nested tags in DEVONthink is that applying the “ttranscribe” tag also applies the “aaction” tag. I have an OmniFocus task every weekend to process my “aaction” tag in DEVONthink.
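The parent-follows-child behavior can be modeled in a few lines. This Python sketch only illustrates the behavior described above using the three child tags from my list. It is not DEVONthink’s implementation, and the names `PARENT_OF` and `apply_tag` are hypothetical.

```python
# Child tag -> parent tag, mirroring the "aaction" group described above.
PARENT_OF = {
    "ttranscribe": "aaction",
    "ttask": "aaction",
    "eemail": "aaction",
}


def apply_tag(existing, tag):
    """Return a new tag set with the tag and all of its ancestors added."""
    result = set(existing)
    while tag is not None:
        result.add(tag)
        tag = PARENT_OF.get(tag)  # None once we reach a top-level tag
    return result
```

So `apply_tag(set(), "ttranscribe")` yields both “ttranscribe” and “aaction”, which is why one weekly pass over “aaction” catches everything.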

action tags

Searching Tags

I can browse by tags in DEVONthink on both Mac and iOS. But often search is much faster. I keep a smart folder in DEVONthink for Mac for searching tags. I double click it to change the tag I want to find.

Tag Search on Mac

On iOS, I just use the magic syntax to search across all databases.

Tag Search iOS

I don’t have to limit a search to just tags. That’s the nice thing about the unique tag words. A global content search also finds the tagged files.

General Global Search

Organizing Files

As an nvALT die hard I put all of my text files in a single folder. I lived this way for many years. It works really well. But I found that I had two major categories of information: work and home. I almost never wanted results combined between those two categories. So I keep a “professional” collection and a “home” collection.

My home notes are just a single folder of notes. With tags and global search I can generally find what I want within seconds.

My professional notes are divided into sub-categories of major projects. In my day job I have about a dozen different projects. I could use a project tag but I like the effect of having all related files together. It’s better for looking at the progression of a project. It also makes searching much quicker, which is often what I care about most when I’m in a meeting or on a call.

I could replace folders with tags in DEVONthink. All things being equal, I like the future proofing of folders over tags. I can drag an entire folder structure out of DEVONthink and still retain the organization.

I’ve found no need for sub-categorizing my “home” notes except for a few big areas.

  1. I have a collection of movie scripts and books I like to draw quotes from. These contaminate search more than other notes so they are in a separate sub-folder.
  2. Research for writing is sub-categorized so that I can take notes and collect web archives. I find no value in having these notes with my main collection.
  3. Business contacts go in a folder where I also collect scans of business cards. I don’t put every contact I have in my address book because that’s just crazy. I don’t want Siri accidentally calling some rando plumber.

Script Files

Editing Files

When I want to edit a file on my Mac, I either edit it right in place in DEVONthink or I use CMD+Shift+O to open the file in Sublime Text because the app respects the default application on macOS.

On iOS I mostly do quick edits within DEVONthink, but that’s a cruel and unusual punishment for anything longer than a paragraph. For longer edits, I send the file to Textastic or to 1Writer. These apps are special because they can share back an actual file object, not just text. When a text file (.txt or .md) comes out of DEVONthink it gets an integer version number appended to the file name. Sharing the same file back to DEVONthink using the “Import” sharing option replaces the original file with the edits. It’s a complete round trip and the file is updated with all of the changes. No copy and paste needed.

Edits from Textastic

Paper Notes

I still capture about half of my notes on paper. I mostly use a Field Notes notebook and carry a Field Notes wallet with a Space Pen. But I’ll write on literally anything. The back of a receipt is fine. Whatever is quick and low effort.

After I take a note on paper, I either transcribe it at the end of the day or scan it with Scanner Pro or my ScanSnap. Scans go into DEVONthink with a “ttranscribe” tag.

I scan all of my full Field Notes notebooks before I shred them. This is mostly for reference if I need to look back and find something I failed to convert to a text file. The scans are named with the approximate date range of the notebook. I also add Spotlight comments that cover some of the contents of the notes. Finally, I adjust the creation and modification dates to match the paper notebook.
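The date adjustment can be done by hand, but for a scan sitting in the file system it is also scriptable. This is a minimal Python sketch that sets only the modification (and access) time with the standard library. Setting the creation date is outside this sketch, and the function name is hypothetical.

```python
import os
from datetime import datetime


def set_modified_date(path: str, when: datetime) -> None:
    """Set a file's access and modification times to the given date."""
    timestamp = when.timestamp()
    # os.utime takes (access_time, modification_time) in seconds since the epoch.
    os.utime(path, (timestamp, timestamp))
```

For a notebook that ended on the last day of 2016, I would call it with the path to the scan and `datetime(2016, 12, 31)`.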

Field Notes

Conclusion

My method of taking notes changes about as often as my taste in music, which is rarely. I’m pretty stuck in my ways because my system works. There was a big loss in productivity when I switched my tag system. Some of it was caused by “muscle-memory” of using certain tags. Some was caused by losing legacy tag information (or the time spent updating them). I’ve slowly gained back some of my time because search in DEVONthink is so much better than most other apps.

It’s still a big risk to put so much reliance on tags outside of the note content. I also really have no way to move them to Windows or Unix (when it’s finally the year of the unix on the desktop). I wish I had access to my notes from my work computer but I can bridge the gap with an iOS device on my desk. If I were to go any other route, it would probably be back to a folder of notes in Dropbox. Luckily that’s just a drag and drop away from DEVONthink.


  1. Probably Ben Franklin. I don’t know. Facts don’t matter anymore. ↩︎