Friday, November 09, 2007

Filtering headings by regexes: the next feature I want from all word processors

I was messing with my Blood of Heroes scenes the other day, and I wanted to make a printout of just one character's scenes. In general I use Word for fiction writing, and the main feature that brings me back to it again and again is its Outline View feature. So long as you use Word's heading styles, you can expand and contract topics by their headings, and that gives a document the equivalent of automatic hyperlinks. It's a great way to see the structure of a document.

Since I had labelled all the scenes with heading names that included the name of the viewpoint character, it wasn't too hard to contract the outline to show just the headings, then delete all the headings I didn't need, leaving a document that just contained the desired scenes.

But I had also recently done an XSL project where I was able to filter out sections of a document by applying a regular expression to their titles. This kind of thing is super easy to do with XML and XSL. I suppose I should see what Word is supporting in terms of XML these days. If Word XML wraps the text under a heading in a wrapper element of some kind, the way DocBook does with [section][title]...[/title][para]...[/para][/section], then it would be easy to do.
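To make the idea concrete, here is a minimal sketch in Python rather than XSL. It is not the stylesheet from my project; it assumes a namespace-free, DocBook-style document with section elements directly under the root, and the sample scenes, the character names, and the function name filter_sections are all made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample document: the scene titles and names are invented.
SAMPLE = """<chapter>
<title>Blood of Heroes</title>
<section><title>Scene 1: Anna</title><para>Anna wakes.</para></section>
<section><title>Scene 2: Boris</title><para>Boris runs.</para></section>
<section><title>Scene 3: Anna</title><para>Anna returns.</para></section>
</chapter>"""


def filter_sections(docbook_xml: str, pattern: str) -> str:
    """Keep only the top-level sections whose title matches the regex.

    Assumes no XML namespace and that sections are direct children
    of the root element.
    """
    root = ET.fromstring(docbook_xml)
    title_re = re.compile(pattern)
    # Iterate over a copy of the list so removing elements is safe.
    for section in list(root.findall("section")):
        title = section.findtext("title", default="")
        if not title_re.search(title):
            root.remove(section)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(filter_sections(SAMPLE, r"Anna"))
```

Running it with the pattern "Anna" should leave the chapter title and the two Anna scenes, and drop the Boris scene.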

Anyway, all this led to my brainstorm: there ought to be a way to display and print only those chunks of a document whose headings match a regular expression. This would be a pretty easy thing to implement, since it's just a variation on outline view.

I use Excel's built-in filtering all the time. There's nothing more handy than making an ad-hoc list and being able to filter it to show only items that match certain criteria. It's so useful I'd almost like to write a novel in Excel. In fact, it would probably be possible to take a DocBook XML document, convert it to Excel-compatible XML, load that into Excel, and take advantage of Excel's filtering there.
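A rough sketch of that conversion, with one substitution: it writes CSV, which Excel also opens, instead of Excel-compatible XML. It assumes the same namespace-free DocBook-style layout as before; the sample content and the name sections_to_csv are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document: the scene titles and names are invented.
SAMPLE = """<chapter>
<title>Blood of Heroes</title>
<section><title>Scene 1: Anna</title><para>Anna wakes.</para></section>
<section><title>Scene 2: Boris</title><para>Boris runs.</para></section>
</chapter>"""


def sections_to_csv(docbook_xml: str) -> str:
    """Return CSV text with one row per section: title, then body text.

    Assumes no XML namespace. Paragraphs within a section are joined
    with newlines inside a single cell.
    """
    root = ET.fromstring(docbook_xml)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["title", "text"])
    for section in root.iter("section"):
        title = section.findtext("title", default="")
        # Collapse whitespace in each paragraph so cells stay tidy.
        paragraphs = [
            " ".join("".join(para.itertext()).split())
            for para in section.findall("para")
        ]
        writer.writerow([title, "\n".join(paragraphs)])
    return buffer.getvalue()


if __name__ == "__main__":
    print(sections_to_csv(SAMPLE))
```

Once the output is saved as a .csv file and opened in Excel, the title column can be filtered like any other list.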

I don't really want to do all that, though. I want someone to add this to a word processor for me.

1 comment:

  1. I just added this to XMetaL: Run update.bat, open the resource manager, click the Lists tab, and on the Show dropdown, select Titles (section, chapter...). Put your regexp in the Filter field and hit return.