I have decided that all this glyph clustering business really is the way to make searchable/extractable PDFs. I still have urge-to-kill-RISING, however, whenever someone praises some application for its ability to do fancy, connected scripts, when said application cannot find the members of a Latin font family and provides no control over Latin typography. GIMP, Inkscape, Scribus, you name it, they are piles of crap, considering how long they have gone now without means to do the simplest operations of Latin typography; GIMP and Inkscape in particular, for being unable to find and identify fonts correctly. Inkscape actually handles OpenType now, but without means of control, and, annoyingly, lately it has been doing fake italics without informing you that they are fake.
I honestly can’t think of one free software GUI program out there that isn’t a pile of crap for Latin typography. Only in the batch processors is there even hope; and here I’m making my own contribution, which unfortunately can only be made slowly and without documentation, on account of my disability.
In other matters: I tried incorporating Python support, but it was too messy. I’m considering integrating Icon 9.4 into the system; another strong possibility is ML-Lua, which has the advantage of being written in OCaml. Both are in the public domain, I believe. Icon is familiar to me, including much of the internals, and (except for its graphics support) is accidentally smart enough to handle UTF-8 encoding, even though it is by now one of the oldest programming languages a person actually encounters. (Just use string matching operations instead of character matching ones.)
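To make that parenthetical concrete, here is a minimal sketch in Python (not Icon) of why string matching is safe on UTF-8 in a byte-oriented language while character-set matching is not. It assumes a language where every "character" is a single byte. The helper names `find_string` and `upto_byteset` are made up for this illustration, loosely in the spirit of Icon's `find` and `upto`, and are not taken from any real library.

```python
# Model of a byte-oriented language: the text is just a sequence of bytes.
text = "naïve café".encode("utf-8")   # ï is C3 AF, é is C3 A9


def find_string(haystack: bytes, needle: str) -> int:
    """Substring search on raw bytes; returns the byte offset or -1.

    Safe on UTF-8: a complete encoded character can never match
    starting in the middle of another character.
    """
    return haystack.find(needle.encode("utf-8"))


def upto_byteset(haystack: bytes, chars: str) -> int:
    """Character-set search as a byte-oriented language would do it.

    The "set of characters" degenerates into a set of single bytes,
    so a multi-byte character is split into unrelated pieces.
    """
    byteset = set(chars.encode("utf-8"))
    for i, b in enumerate(haystack):
        if b in byteset:
            return i
    return -1


print(find_string(text, "é"))    # byte offset of the real é
print(upto_byteset(text, "é"))   # stops early, on the lead byte of ï
```

The string search lands on the actual é, while the byte-set search stops at the first byte of ï, because both characters share the lead byte C3. Offsets are in bytes, not characters, so anything that counts or slices by position still needs care.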