I'll admit, my primary motivation for posting this was a hope that jtauber would have something to say about it.
I had a similar thought for a start, but not for English. Ideally you'd use a word list whose spelling is as phonemic as possible. That's one of the reasons I thought of Yiddish, whose romanization (YIVO) is very regular and closely reflects its phonology. You'd have to come up with some mildly sophisticated rules for a lexer that crawls the words and builds syllables, mostly for building diphthongs and consonant clusters, but nothing too hairy.
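A rough sketch of what such a lexer might look like, in Python. The multigraph and vowel lists are an illustrative subset I'm assuming for the example, not a complete YIVO table, and `segment` and `skeleton` are made-up names. It does a greedy longest-match split into phoneme-like units, then groups adjacent consonants into clusters. It ignores real complications such as syllabic consonants and the ambiguity between a "ts" unit and a "t" followed by "s".

```python
# Assumed, incomplete inventory for illustration only; longest entries come first
# so that the greedy match prefers "tsh" over "ts" or "sh".
MULTIGRAPHS = ["dzh", "tsh", "sh", "kh", "zh", "ts", "ay", "ey", "oy"]
VOWELS = {"a", "e", "i", "o", "u", "ay", "ey", "oy"}


def segment(word):
    """Greedy longest-match split of a romanized word into phoneme-like units."""
    units = []
    i = 0
    while i < len(word):
        for m in MULTIGRAPHS:
            if word.startswith(m, i):
                units.append(m)
                i += len(m)
                break
        else:
            # No multigraph starts here, so take a single letter.
            units.append(word[i])
            i += 1
    return units


def skeleton(word):
    """Group units into consonant clusters ("C") and vowel nuclei ("V")."""
    groups = []
    for unit in segment(word):
        kind = "V" if unit in VOWELS else "C"
        if kind == "C" and groups and groups[-1][0] == "C":
            # Extend the current consonant cluster.
            groups[-1] = ("C", groups[-1][1] + [unit])
        else:
            groups.append((kind, [unit]))
    return groups


print(segment("mentsh"))
print(skeleton("shtetl"))
```

From the skeleton you could then decide where to cut clusters into coda and onset, which is where the language-specific rules would come in.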
My intuition has been that Yiddish is particularly dense in its phonotactics. Now that I'm studying Italian, I wonder whether it is too.
I suppose you could also call it phonotactic density.
That is, for all the possible lemmas according to a language's phonotactics, how many of them actually exist in the language?
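To make the ratio concrete, here is a minimal Python sketch under heavy simplifying assumptions: only monosyllables, with the phonotactics reduced to a free combination of onsets, nuclei, and codas. The inventories and the lexicon below are toy data invented for the example, and `coverage` is a hypothetical helper name.

```python
from itertools import product


def coverage(onsets, nuclei, codas, lexicon):
    """Fraction of phonotactically possible monosyllables found in the lexicon."""
    # Every onset + nucleus + coda combination counts as "possible" here.
    possible = {o + n + c for o, n, c in product(onsets, nuclei, codas)}
    attested = possible & set(lexicon)
    return len(attested) / len(possible)


# Toy inventories and lexicon, made up for illustration.
onsets = ["", "t", "sh"]
nuclei = ["a", "o"]
codas = ["", "g"]
lexicon = {"tog", "sho", "to", "blits"}  # "blits" falls outside the toy phonotactics

print(coverage(onsets, nuclei, codas, lexicon))  # 3 of 12 possible forms -> 0.25
```

A real measurement would need actual phonotactic constraints (not every onset pairs with every coda) and polysyllabic words, so the denominator is the hard part.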
I have often wondered about what I'll call, for lack of a better term, 'phonotactic coverage'.