Friday, September 19, 2008

CoreText - Getting Glyph Names

Looking at the published file and its log, I can see that there is a problem with the character codes that Cello is publishing. The published file format expects character codes, which are then mapped to glyph indexes. Cello is feeding it glyph indexes instead. This works, but it is not right and will cause problems in other places where text is used. If you are not too familiar with the difference - Wikipedia has an excellent introduction here.

Getting glyph codes as a result of line layout is a natural consequence of using Core Text. Core Text takes styled text and turns it into glyph runs (lists of glyph codes) - this is exactly what I want. Core Text does all the clever bits: positioning, ligatures, alternate character sets and so on. So the problem that I face is the need to convert glyph codes back to character codes.

At this point it is worth mentioning that it is not always possible to convert glyph codes back to character codes. Line layout will do things like convert some multiple-character forms to ligatures (for example, "f" followed by "i" can become a single "fi" glyph). Some fonts also have alternate forms - for example "end swashes", where a character at the end of a line can be represented with a different glyph - and it goes on. The best you can hope for is to be able to convert most glyph codes to character codes.

I have been thinking about how to approach the problem. The best solution that I can see at the moment is to:
  1. Pull the name of each glyph - Core Text does not support this directly, but it is possible to convert a CTFontRef to a CGFontRef using CTFontCopyGraphicsFont and then get the name using CGFontCopyGlyphNameForGlyph.
  2. Map each glyph name onto a character code using Adobe's magic technique, as described here.
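
As a rough sketch of step 2, assuming the technique in question is the glyph naming convention from Adobe's Glyph List specification: drop any suffix after the first period, look the name up in a table, and otherwise decode the "uniXXXX" and "uXXXX" forms. The function name glyph_name_to_codepoint and the tiny kNames table are made up for illustration; a real version would load the full glyph list and would also split ligature names such as "f_i" on the underscore, which this sketch leaves unmapped. The name passed in would be the string from CGFontCopyGlyphNameForGlyph in step 1, converted to a C string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative excerpt only: a real table would hold the full glyph list. */
static const struct { const char *name; uint32_t code; } kNames[] = {
    { "space", 0x0020 }, { "comma", 0x002C }, { "period", 0x002E },
    { "one",   0x0031 }, { "A",     0x0041 }, { "a",      0x0061 },
};

/* Parse exactly `len` uppercase hex digits; return -1 on any other character. */
static long parse_hex(const char *s, size_t len)
{
    long value = 0;
    for (size_t i = 0; i < len; i++) {
        char c = s[i];
        int digit;
        if (c >= '0' && c <= '9')      digit = c - '0';
        else if (c >= 'A' && c <= 'F') digit = c - 'A' + 10;
        else return -1;
        value = value * 16 + digit;
    }
    return value;
}

/* True for valid Unicode scalar values (surrogates are excluded). */
static int is_scalar(long v)
{
    return v >= 0 && v <= 0x10FFFF && !(v >= 0xD800 && v <= 0xDFFF);
}

/* Hypothetical helper: returns the code point for a glyph name,
   or -1 when the name cannot be mapped to a single character. */
long glyph_name_to_codepoint(const char *name)
{
    assert(name != NULL);

    /* Keep only the text before the first '.', so "A.swash" is treated as "A". */
    size_t len = strcspn(name, ".");
    if (len == 0) return -1;            /* e.g. ".notdef" */

    /* 1. Known names from the table. */
    for (size_t i = 0; i < sizeof kNames / sizeof kNames[0]; i++) {
        if (strlen(kNames[i].name) == len &&
            memcmp(kNames[i].name, name, len) == 0)
            return (long)kNames[i].code;
    }

    /* 2. "uni" followed by exactly four hex digits. */
    if (len == 7 && memcmp(name, "uni", 3) == 0) {
        long v = parse_hex(name + 3, 4);
        return is_scalar(v) ? v : -1;
    }

    /* 3. "u" followed by four to six hex digits. */
    if (len >= 5 && len <= 7 && name[0] == 'u') {
        long v = parse_hex(name + 1, len - 1);
        return is_scalar(v) ? v : -1;
    }

    return -1;                          /* unmapped, including ligatures like "f_i" */
}
```

Returning -1 for anything unrecognised matches the point made earlier: only most glyphs can be converted back, so the caller needs a way to notice the ones that cannot.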
