Friday, May 30, 2008

Glyphs to Paths - Cubics & Quadratics in CoreText

Sometimes the world is a bit more upside down than you might initially perceive - Cello's output file format requires its curves to be described as Quadratics rather than Cubics. So I need to convert any Cubic paths that I get from CoreText to Quadratics.

Converting from Cubics to Quadratics is far less straightforward than the other way around. Cubics can only be approximated by Quadratics; they cannot be exactly converted. This sort of conversion is something that I have had to deal with in the past - it is not the sort of maths that I can pull off from first principles - so I have been working it out with Google.

The best explanation of Cubic & Quadratic conversion that I have found is on the Font Forge (an open source font editor) website and can be found here. The section Converting Postscript to TrueType explains very clearly the issues that are involved in the conversion - check it out!

For the actual conversion, after grubbing around, it would seem that I have a few options.
  1. Font Forge - As a font editor Font Forge does this conversion. Looking through the sources this happens in a file called SplineOrder2.c. Of the implementations of the conversion this is by far the biggest and I conjecture might be the one best suited to converting fonts. The license for the software is a modified BSD license (a very permissive license) so I could use it. Font Forge has its own representation for curves so my CGPath would need to be converted.
  2. Cello - Elsewhere Cello converts Cubics to Quadratics. The curves are in its own internal path representation.
  3. GXLibrary - When QuickDraw GX was released the SDK had a library (supplied as source code) that includes a conversion from Cubics to Quadratics. The source can be found in CubicLibrary.c. The conversion is all based on GX data structures and fixed point maths.
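None of the three implementations above is reproduced here, but the basic idea of approximating a Cubic with Quadratics can be sketched in a few lines. This is a minimal sketch of a common approach (midpoint approximation plus subdivision), not Font Forge's, Cello's or GX's method. The names Pt, Quad and CubicToQuads are made up for illustration, and Pt is a stand-in for CGPoint. A single Quadratic with control point (3(c1 + c2) - (c0 + c3)) / 4 differs from the Cubic by at most sqrt(3)/36 times the length of c3 - 3c2 + 3c1 - c0, so the Cubic is split in half until that bound drops below a tolerance.

```cpp
#include <cmath>
#include <vector>

struct Pt { double x, y; };        // stand-in for CGPoint

// One quadratic segment; it starts where the previous segment ended.
struct Quad { Pt ctrl, end; };

static Pt Lerp(const Pt &a, const Pt &b, double t)
{
    Pt p = { a.x + (b.x - a.x) * t, a.y + (b.y - a.y) * t };
    return p;
}

// Approximate the cubic (c0, c1, c2, c3) with quadratics, appending to out.
// tolerance is the largest allowed distance between the two curves.
void CubicToQuads(const Pt &c0, const Pt &c1, const Pt &c2, const Pt &c3,
                  double tolerance, std::vector<Quad> &out, int depth = 0)
{
    // Third difference of the control points; it is zero when the cubic
    // is really a degree-elevated quadratic.
    const double dx = c3.x - 3.0 * c2.x + 3.0 * c1.x - c0.x;
    const double dy = c3.y - 3.0 * c2.y + 3.0 * c1.y - c0.y;

    // Largest distance between the cubic and its midpoint quadratic.
    const double error = std::sqrt(3.0) / 36.0 * std::sqrt(dx * dx + dy * dy);

    if (error <= tolerance || depth >= 16)   // depth guard stops runaway recursion
    {
        Quad q;
        q.ctrl.x = (3.0 * (c1.x + c2.x) - (c0.x + c3.x)) / 4.0;
        q.ctrl.y = (3.0 * (c1.y + c2.y) - (c0.y + c3.y)) / 4.0;
        q.end = c3;
        out.push_back(q);
        return;
    }

    // de Casteljau split at t = 0.5, then approximate each half.
    const Pt ab = Lerp(c0, c1, 0.5), bc = Lerp(c1, c2, 0.5), cd = Lerp(c2, c3, 0.5);
    const Pt abc = Lerp(ab, bc, 0.5), bcd = Lerp(bc, cd, 0.5);
    const Pt mid = Lerp(abc, bcd, 0.5);
    CubicToQuads(c0, ab, abc, mid, tolerance, out, depth + 1);
    CubicToQuads(mid, bcd, cd, c3, tolerance, out, depth + 1);
}
```

Each split divides the error bound by eight, so even tight tolerances need only a handful of segments. A font-quality conversion would also want to split at inflection points and keep tangents continuous, which this sketch does not attempt.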

Thursday, May 29, 2008

Glyphs to Paths - Quadratics & Cubics in CoreText

Pressing on with the conversion of the file that deals with glyphs I need to obtain the path that represents a glyph. CoreText provides the function CTFontCreatePathForGlyph that returns a CGPath of an individual glyph. The final thing to do then is to convert the path to the output format.

Core Graphics provides a function CGPathApply that will apply a user defined function to each element a path is composed of. So your user function is applied to all the moveto, lineto and curveto operations that make up the path.

To make getting to the path elements a little easier in C++ I have written a visitor class. The class is quite trivial, little more than a switch statement, that calls through to virtual moveto, lineto and curveto methods that can be overridden. Another way of doing this would be to write an iterator.
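As a rough illustration of the shape such a visitor might take, here is a sketch that compiles without the Core Graphics headers. Point, ElementType and Element are stand-ins for CGPoint, CGPathElementType and CGPathElement, and the class and method names are illustrative rather than the actual Cello code. The static Apply function has the same shape as a CGPathApplierFunction, with the visitor passed through the info pointer.

```cpp
// Stand-ins so the sketch compiles without the Core Graphics headers.
// In real code these would be CGPoint, CGPathElementType and CGPathElement.
struct Point { double x, y; };
enum ElementType { kMoveTo, kLineTo, kQuadCurveTo, kCurveTo, kCloseSubpath };
struct Element { ElementType type; const Point *points; };

class PathVisitor
{
public:
    PathVisitor() { m_lastPt.x = m_lastPt.y = 0.0; m_startPt = m_lastPt; }
    virtual ~PathVisitor() {}

    // Callback with the shape of a CGPathApplierFunction: info is the visitor.
    static void Apply(void *info, const Element *element)
    {
        static_cast<PathVisitor *>(info)->Visit(*element);
    }

    // The switch: call the matching virtual, then remember the new end point.
    // m_lastPt is updated after the call so overrides see the previous point.
    void Visit(const Element &e)
    {
        const Point *p = e.points;
        switch (e.type)
        {
        case kMoveTo:
            MoveToPoint(p[0]);
            m_startPt = m_lastPt = p[0];
            break;
        case kLineTo:
            LineToPoint(p[0]);
            m_lastPt = p[0];
            break;
        case kQuadCurveTo:            // control point, end point
            QuadCurveToPoint(p[0], p[1]);
            m_lastPt = p[1];
            break;
        case kCurveTo:                // two control points, end point
            CurveToPoint(p[0], p[1], p[2]);
            m_lastPt = p[2];
            break;
        case kCloseSubpath:
            CloseSubpath();
            m_lastPt = m_startPt;     // closing returns to the subpath start
            break;
        }
    }

protected:
    virtual void MoveToPoint(const Point &) {}
    virtual void LineToPoint(const Point &) {}
    virtual void QuadCurveToPoint(const Point &, const Point &) {}
    virtual void CurveToPoint(const Point &, const Point &, const Point &) {}
    virtual void CloseSubpath() {}

    Point m_lastPt;    // end point of the previous element
    Point m_startPt;   // first point of the current subpath
};

// Example override: counts the elements it is shown.
class CountingVisitor : public PathVisitor
{
public:
    CountingVisitor() : moves(0), lines(0), curves(0) {}
    int moves, lines, curves;
    Point LastPoint() const { return m_lastPt; }

protected:
    virtual void MoveToPoint(const Point &) { ++moves; }
    virtual void LineToPoint(const Point &) { ++lines; }
    virtual void QuadCurveToPoint(const Point &, const Point &) { ++curves; }
    virtual void CurveToPoint(const Point &, const Point &, const Point &) { ++curves; }
};
```

With the real types, the visitor would be handed to CGPathApply as the info argument along with a callback like Apply.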

A slight complication is that the CGPath can contain both cubic and quadratic curves. In the target file format all curves are cubic. This means that any QuadCurveToPoint calls have to be converted to the cubic CurveToPoint. I am not yet in the position to test any of this - but I think it is reasonable to assume that there is a high probability that some font outlines may be expressed in quadratic curves - simply because quadratic curves are the native curve type of TrueType fonts. I would expect the Postscript fonts (Type-1 fonts and friends) to appear as cubics as that is the native curve type of these fonts.

Converting the QuadCurveToPoint so that it calls through to CurveToPoint is not too bad. The key thing is to have the previous end-point - and the rest is maths. I did not work out the maths that does this conversion myself - but it is something that I have had to use before. The function (or method) drops out quite simply. A link to some maths that seems strangely consistent with my old notes can be found here.

void CGPathVisitor::QuadCurveToPoint(const CGPoint &q1, const CGPoint &q2)
{
  CGPoint c0 = m_lastPt; // last point

  CGPoint c1 = { c0.x + 2.0 * (q1.x - c0.x) / 3.0,
                 c0.y + 2.0 * (q1.y - c0.y) / 3.0 };
  CGPoint c2 = { q2.x + 2.0 * (q1.x - q2.x) / 3.0,
                 q2.y + 2.0 * (q1.y - q2.y) / 3.0 };
  CGPoint c3 = q2;

  this->CurveToPoint(c1, c2, c3);
}
I have not tested this code yet - if there are problems with it I will come back and correct this post.
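One way the conversion could be checked without any fonts to hand is to evaluate both curves at the same parameter values and compare. The sketch below assumes a plain Pt struct as a stand-in for CGPoint. QuadAt, CubicAt and MaxElevationError are hypothetical helper names, and the control point formulas are copied from the method above. Since raising a Quadratic to a Cubic is exact, the two curves should agree to within floating point rounding.

```cpp
#include <cmath>

struct Pt { double x, y; };   // stand-in for CGPoint

// Point on the quadratic (q0, q1, q2) at parameter t.
Pt QuadAt(const Pt &q0, const Pt &q1, const Pt &q2, double t)
{
    const double u = 1.0 - t;
    Pt p = { u * u * q0.x + 2.0 * u * t * q1.x + t * t * q2.x,
             u * u * q0.y + 2.0 * u * t * q1.y + t * t * q2.y };
    return p;
}

// Point on the cubic (c0, c1, c2, c3) at parameter t.
Pt CubicAt(const Pt &c0, const Pt &c1, const Pt &c2, const Pt &c3, double t)
{
    const double u = 1.0 - t;
    Pt p = { u * u * u * c0.x + 3.0 * u * u * t * c1.x
                 + 3.0 * u * t * t * c2.x + t * t * t * c3.x,
             u * u * u * c0.y + 3.0 * u * u * t * c1.y
                 + 3.0 * u * t * t * c2.y + t * t * t * c3.y };
    return p;
}

// Build the cubic the same way QuadCurveToPoint does, then return the
// largest distance between the two curves over samples + 1 values of t.
double MaxElevationError(const Pt &q0, const Pt &q1, const Pt &q2, int samples)
{
    const Pt c0 = q0;
    const Pt c1 = { q0.x + 2.0 * (q1.x - q0.x) / 3.0,
                    q0.y + 2.0 * (q1.y - q0.y) / 3.0 };
    const Pt c2 = { q2.x + 2.0 * (q1.x - q2.x) / 3.0,
                    q2.y + 2.0 * (q1.y - q2.y) / 3.0 };
    const Pt c3 = q2;

    double worst = 0.0;
    for (int i = 0; i <= samples; ++i)
    {
        const double t = static_cast<double>(i) / samples;
        const Pt a = QuadAt(q0, q1, q2, t);
        const Pt b = CubicAt(c0, c1, c2, c3, t);
        const double d = std::sqrt((a.x - b.x) * (a.x - b.x)
                                 + (a.y - b.y) * (a.y - b.y));
        if (d > worst)
            worst = d;
    }
    return worst;
}
```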

Friday, May 23, 2008

Kerning Tables and CoreText

I don't think I have ever worked on something type related without having, at some stage, to grub around in font tables. Cello, it turns out, is no different. The output file requires a subset of kerning pairs of the fonts used in the documents - to do this I need to get a set of all the kerning pairs that are defined in a given font. Core Text won't give me the kerning pairs but it has a function CTFontCopyTable that will give me, via a CFDataRef, a pointer to an arbitrary table in the font. All the fonts Core Text deals with are TrueType fonts, and those that are not (Type 1 postscript fonts) are cleverly packed up on the fly as synthetic TrueType fonts. CTFontCopyTable takes an options flag that allows you to screen out synthetic tables - should you wish.

So to get the kerning table of a font you:
CFDataRef kerningTable =
  ::CTFontCopyTable(font, kCTFontTableKern, kCTFontTableOptionNoOptions);
// NULL if the font has no kern table; the caller owns the data and must CFRelease it
const UInt8 *kerningTablePtr =
  kerningTable ? ::CFDataGetBytePtr(kerningTable) : NULL;
The format of the TrueType fonts and font tables can be found here. Details of the kerning table can be found here.

There are four different formats of kerning tables. I plan initially to support Format0 only.
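To give an idea of what reading Format0 involves, here is a minimal sketch that works on raw bytes. It assumes the pointer is already positioned at the start of the format 0 data, past the table header and the subtable header (those headers differ between the older and newer versions of the kern table and are not handled here). KernPair, ReadU16 and ReadKernFormat0 are made-up names. Per the TrueType documentation, the format 0 data begins with four 16-bit values (nPairs, searchRange, entrySelector, rangeShift) followed by nPairs six-byte records of left glyph, right glyph and a signed kerning value, all big-endian.

```cpp
#include <cstddef>
#include <vector>

struct KernPair
{
    unsigned left;    // glyph index of the left-hand glyph
    unsigned right;   // glyph index of the right-hand glyph
    int value;        // kerning adjustment in font units
};

// Font tables are big-endian regardless of the host CPU.
static unsigned ReadU16(const unsigned char *p)
{
    return (static_cast<unsigned>(p[0]) << 8) | p[1];
}

// data points at the format 0 data (after the subtable header);
// length is the number of bytes available from there.
bool ReadKernFormat0(const unsigned char *data, std::size_t length,
                     std::vector<KernPair> &pairs)
{
    const std::size_t headerSize = 8;   // nPairs, searchRange, entrySelector, rangeShift
    const std::size_t recordSize = 6;   // left, right, value

    if (length < headerSize)
        return false;

    const std::size_t nPairs = ReadU16(data);
    if (length < headerSize + nPairs * recordSize)
        return false;                   // truncated table

    const unsigned char *p = data + headerSize;
    for (std::size_t i = 0; i < nPairs; ++i, p += recordSize)
    {
        KernPair k;
        k.left = ReadU16(p);
        k.right = ReadU16(p + 2);
        int v = static_cast<int>(ReadU16(p + 4));
        if (v >= 0x8000)                // value is a signed 16-bit quantity
            v -= 0x10000;
        k.value = v;
        pairs.push_back(k);
    }
    return true;
}
```

The search fields are skipped because the sketch reads every pair in order. They only matter if the table is binary searched in place.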

Wednesday, May 21, 2008

Fonts and Font Scalars - from D-Type to CoreText

I have come to the part of Cello where I need to start dealing with text and fonts. Detailed information about glyphs, metrics, kerning pairs, glyph outlines etc. all need to be known in order to generate the final output file.

The Windows (MFC) version of Cello uses the D-Type font engine (scalar). Looking at D-Type it looks pretty good, and it is available for OSX. I do not, however, plan to use it - instead I will use CoreText. Core Text is Apple's new type technology. Released in Leopard (10.5), it replaces ATSUI which has entered the holding area that technologies enter before deprecation. I plan to use CoreText for a variety of reasons - the main one is that it integrates directly and naturally with CoreGraphics (Quartz), Apple's graphics technology. I plan to do all the drawing with CoreGraphics so using the complementary companion technology makes sense. I have experience of mashing and melding disparate graphics engines and font scalars and the like together and it is a whole lot easier to have a single imaging model.

The CoreText API looks clean and easy to understand. It is built on top of standard Core Foundation objects like CFDictionary as well as introducing its own new types like CTFont. As with other Core Foundation technologies it toll-free-bridges onto equivalent Cocoa NS types. This will make the inevitable protrusions into Cocoa for the UI straightforward to deal with.

Tuesday, May 20, 2008

typename in templates - gcc errors

Pressing on I am working through the generator files. These are a set of files that generate the output. The biggest problem I found so far in these is some template definitions that tripped up gcc, which requires a typename to be specified in some circumstances where Visual C++ was quite happy. The following class - when compiled in XCode

template <class T>
class Foo
{
public:
  Foo(const T&);
  std::list<T*>::iterator Bar();
};
Results in error: expected ';' before 'Bar'

Adding typename fixes the problem - so the gcc friendly Bar looks like this:
  typename std::list<T*>::iterator Bar();
As with many compiler errors the error message gave me little insight into the problem. I tracked down the solution looking through the boost libraries for code that returns an iterator.
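For what it is worth, the reason behind the error is that std::list<T*>::iterator depends on the template parameter T, so until the template is instantiated the compiler cannot tell whether iterator names a type or a value, and without typename it assumes a value. A complete version that compiles might look like the sketch below. The constructor and the m_items member are invented here to make the example self-contained and are not from the Cello sources.

```cpp
#include <list>

template <class T>
class Foo
{
public:
    // Hypothetical constructor: keeps a pointer to one item.
    explicit Foo(T *item) { m_items.push_back(item); }

    // iterator is a name that depends on T, so typename is needed
    // to tell the compiler that it names a type.
    typename std::list<T*>::iterator Bar() { return m_items.begin(); }

private:
    std::list<T*> m_items;
};
```

The same applies inside function bodies, for example when declaring a local variable of type typename std::list<T*>::iterator.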

Monday, May 19, 2008

Pargen for YACC and LEX

The YACC and LEX libraries that Cello links against come from Pargen. Pargen is a YACC and LEX distribution for use on Windows. Cello was supplied with the Pargen libraries without source. Fortunately Bumble Bee Software have the sources to their previous release of their Pargen library available for download. Checking the source headers against the headers that were included with the Cello code I can see that they are the same. The Pargen library compiled almost without problem.

Sunday, May 18, 2008

Replacing GetKeys with CGEventSourceKeyState

Working through some of the ECMA (JavaScript) interpreter there is a line of code that tests for the escape key being held down in order to abort execution.
if ((::GetAsyncKeyState(VK_ESCAPE) & 0x8000))
{
  :
}
On the Mac the equivalent UI convention is the command-period. Looking at the world through old Carbon programmer eyes I would probably use GetKeys. In the brave new world of Cocoa replete with the Animal Farm bleat of "Cocoa good, Carbon bad" GetKeys is deprecated. I could use it but I'd rather not.

It took a little hunting but the replacement for GetKeys is CGEventSourceKeyState. This function returns whether a given key (expressed as a virtual key code) is depressed. Virtual key codes are defined in HIToolbox/Events.h.

So the function drops out quite nicely:
bool IsCommandPeriodPressed()
{
  CGEventSourceStateID eventSource
    = kCGEventSourceStateCombinedSessionState;

  return ::CGEventSourceKeyState(eventSource, kVK_Command)
    && ::CGEventSourceKeyState(eventSource, kVK_ANSI_Period);
}
As a postscript, and for the avoidance of doubt, I have no beef with Cocoa; Cocoa is good and fun as well as being a splendid bed-time beverage for children and adults alike. After all I will be using pure Cocoa for the Cello UI - it is just the Cocoa religion thing that is a bit scary.

Pointers to Methods in C++

Working through the ECMA script (JavaScript) interpreter I had some problems compiling a dispatch table that used function pointers. The function pointers were to non-static object methods. My gut reaction was that this must be wrong and a non-standard MSDN/Visual C++ extension. Having worked with C++ for a long time you sort of assume that you know the beast.

I briefly started converting the methods to static methods that took the object's this pointer as the first parameter before I stepped back and looked at the problem properly. Function pointers to class methods are a part of the C++ standard. It required a very light touch to get the code to compile correctly. Here is a stolen code sample illustrating pointers to methods. I have modified it slightly to include a typedef for the method pointer.

#include <iostream>

class Adder
{
  int x;
public:
  Adder(int in) : x(in) { }
  int addx(int y) { return x + y; }

  // declare MethodPointer
  typedef int (Adder::*MethodPointer)(int);
};

int main()
{
  // pointer to the addx method
  Adder::MethodPointer new_function
    = &Adder::addx;

  // Create an Adder instance
  Adder foo = Adder(3);

  // Call the method
  std::cout << (foo.*new_function)(5)
    << std::endl; // prints "8"
  return 0;
}
As ever I stand back staggered by what is out there on the internet. Wikipedia has an excellent section on function pointers here - and amazingly there is/was a site http://www.function-pointer.org/ dedicated to explanations of function pointers.
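The dispatch table that started all this can be sketched along the same lines. The class below is purely illustrative (Calculator, Dispatch and the operation names are invented and have nothing to do with the interpreter's actual table). It keeps a map from names to method pointers and calls through with the ->* operator on this.

```cpp
#include <map>
#include <string>

class Calculator   // hypothetical class, for illustration only
{
public:
    // Pointer to any Calculator method taking two ints and returning an int.
    typedef int (Calculator::*Method)(int, int);

    Calculator()
    {
        m_table["add"] = &Calculator::Add;
        m_table["sub"] = &Calculator::Sub;
    }

    // Look the name up and call the method on this object.
    // Returns false if the name is not in the table.
    bool Dispatch(const std::string &name, int a, int b, int &result)
    {
        std::map<std::string, Method>::const_iterator it = m_table.find(name);
        if (it == m_table.end())
            return false;
        result = (this->*(it->second))(a, b);
        return true;
    }

private:
    int Add(int a, int b) { return a + b; }
    int Sub(int a, int b) { return a - b; }

    std::map<std::string, Method> m_table;
};
```

Note the extra parentheses around this->*(it->second). The call operator binds tighter than ->*, so they are required.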

Saturday, May 17, 2008

JavaScript with YACC and LEX

Pressing on with Cello I have started compiling the files that are responsible for the implementation of ECMA script (JavaScript).

Inheriting a code base is a whole set of surprises one of which is that Cello has its own JavaScript interpreter built in. I have some experience of working with SpiderMonkey and have looked at the JavaScript interpreter that is built into WebKit. Implementing a JavaScript interpreter is not a small job - and as with most of the Cello code it is well written and easy to understand. Down the line when everything is working (quite a distant point) I may remove it and instead use the WebKit implementation of JavaScript - I don't have any immediate desire to do this - instead I am very much following the path of least resistance.

At the moment the easiest route is to get it to compile and build. Thus far, working through the code file by file, the code compiles with few errors - typically four or five per file and I have so many clean runs (files with no errors) that I look forward to the ones with problems to fix. The biggest issue I have had (so far) is the absence of _fpclass, but the equivalent fpclassify seems to fit the bill.
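One difference to keep in mind when swapping the two is that _fpclass reports the sign along with the class (negative infinity and negative zero have their own flags), while fpclassify only reports the class. Where the sign matters it can be recovered with signbit. The helpers below are a sketch of how that might look and are an assumption about what the interpreter needs, not code from Cello. They use the C++ spellings std::fpclassify and std::signbit from cmath.

```cpp
#include <cmath>

// fpclassify gives the class (FP_NAN, FP_INFINITE, FP_ZERO, FP_SUBNORMAL,
// FP_NORMAL); signbit supplies the sign that _fpclass would have folded in.

bool IsNaN(double v)
{
    return std::fpclassify(v) == FP_NAN;
}

bool IsNegativeInfinity(double v)
{
    return std::fpclassify(v) == FP_INFINITE && std::signbit(v);
}

bool IsPositiveInfinity(double v)
{
    return std::fpclassify(v) == FP_INFINITE && !std::signbit(v);
}

bool IsNegativeZero(double v)
{
    return std::fpclassify(v) == FP_ZERO && std::signbit(v);
}
```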

For fun and games down the line Cello links against Bumble-Bee Software's Pargen YACC and Lex libraries. The source of these libraries is available (though not included in my Cello sources). Building these seems quite possible.

Thursday, May 15, 2008

File & Related Classes

All of the file & related classes in Cello now compile. So I have a folder full of files 'ticked off'. This part of the project took longer and was more painful than I had imagined - simply because structured storage was an unexpected wrinkle.

POLE, the structured storage library I have deployed, will only read structured storage files and won't write them. Looking at the code there seems to have been a reasonable start in getting the writing part of POLE completed but it is not close to being finished. It is something that I am reluctant to undertake. POLE, amongst other things, was the only non-GPL C/C++ library that I found.

The initial part of the Cello port is to get enough together to open files generated on the PC version and for the code to run to generate the output. So to deal with the issue of writing I have 'stubbed-out' all of Cello that butts onto the structured storage writing and deferred the issue. Though it would be nice to be able to write files in the existing format it is not essential.

Sunday, May 11, 2008

Boost - Core Foundation

I have started plumbing in some of Core Foundation into Cello. Core Foundation objects are owner-counted which leaves you to take care of calling CFRetain and CFRelease at appropriate moments. It is very straightforward except that adding all of the CFRetain and CFRelease calls clutters your code and (more importantly) can be tricky to get right - it is really easy to forget a release, especially if you consider exceptions.

Things are eased a little if you have classes that wrap some of the Core Foundation objects - but the thing that really makes the difference is smart pointers.

In a previous life I used Apple Class Suites (ACS) which was a part of MacApp. ACS wrapped most of the underlying OS APIs in good C++ and introduced me to using smart pointers that would automatically call CFRetain and CFRelease. So if you wanted a pointer to a CFString that would automatically call CFRelease when it went out of scope there was an AutoCFString_AC smart pointer.

There is excellent support for smart pointers in boost. Declaring and implementing a smart pointer using intrusive_ptr to do the same as MacApp's AutoCFString_AC takes just three lines of code in boost:
// CFStringRef is a pointer to const __CFString, so that is the pointee type
typedef boost::intrusive_ptr<const __CFString> auto_CFString;
inline void intrusive_ptr_add_ref(CFStringRef p)
{ ::CFRetain(p); }
inline void intrusive_ptr_release(CFStringRef p)
{ ::CFRelease(p); }
For those curious about the MacApp annex of history (MacApp, though 'dodo dead', has played an important part in the development of many application frameworks), there is an insightful article in Wikipedia here.

Saturday, May 10, 2008

Boost - Standing on Giants' Shoulders

I have just downloaded the latest release of Boost and checked it into the CVS repository for the Cello project. Boost starts where STL leaves off - and allows you, as a programmer, to stand on the shoulders of giants.

Boost is huge - checking it in manually once took a friend of mine the best part of a day (CVS won't let you check in recursively). I was fortunate to find the following commands that will recursively add to CVS under OSX in Torsten Curdt's blog.
find . -type d -print | grep -v CVS | xargs -n1 cvs add
find . -type f -print | grep -v CVS | xargs -n1 cvs add
The first command adds the folders from the current directory - the second one adds the files. Standing on Torsten Curdt's shoulders saved a huge amount of time.

The other thing you can do is to just deploy the parts of boost you are using - Alex Ott's blog details this.

The MFC version of Cello does not use Boost (the code is reasonably old) - I decided to add it to the project as I need to have some objects that are managed with owner counts and shared_ptr is ready rolled and works well.

Side by Wrinkled Side - VMWare Fusion

I have just installed VMWare Fusion. This is an almost magical piece of software that means that you can run Windows at the same time as OSX. The s/w has some trickery that means that your XP (and I am sure Vista) windows can be scarily interleaved with your Mac windows should you want it. The whole mish-mash is a little strange (and behaves a little strangely) but it means that you can run Visual Studio at the same time as XCode - and conveniently do things like copy/paste between them.

Later in the MFC->OSX port when I will be debugging I think that being able to run XCode and Visual Studio 'side by wrinkled side ... like two old kippers in a box' will save a huge amount of time.

Friday, May 9, 2008

Testing and POLE Dancing

Pressing on with deploying POLE I am starting to look at unit testing. The idea is to be able to test the new iterator before deploying it. Ideas like this are nothing new in the programming world - but coming from a project that is old, a little out of control and never had any unit testing it is good to deploy a little rigor. What I am hoping to gain from this is to eliminate unknowns up front and remove some of the uncertainties with POLE.

XCode comes with two built in unit test frameworks: CPlusTest for C & C++, and OCUnit for Objective-C. There are other unit test frameworks around, and I note a particularly good post in Dave Dribin's Blog. I have elected to go with CPlusTest despite Dave Dribin's reservations - the integration with XCode is good and I also plan to use OCUnit when I come to start work on the UI.

CPlusTest is horribly easy to use - you set up a new target in your XCode project by choosing Unit Test Bundle from the list of possible targets, and then choose the C++ Test Case when you add a new file. XCode drops in a stub class to fill in. You have to register your test cases in the file - it is basic boiler plate stuff and Apple's documentation is quite clear. I only mention these steps to give an idea of how simple it is.

With CPlusTest you choose when you want the test cases to be run - they will either run every time you build your main target or just as a part of building the test target. Test failures are logged as errors in your test file as if they were compiler errors. This struck me as a little odd but it is quite convenient as you are reasonably close to the problem area when you get a test failure - you can command-double-click through the identifiers to where you need to be quite quickly.

The last unit testing I did was many years ago with a particularly grim piece of code called The Test Harness back in the days when real men spat Fortran - the ease with which unit testing can be deployed now really is a pleasure.

Tuesday, May 6, 2008

Structured Storage

I have got through the first set of utilities and have started on the nitty-gritty of the file operations. From a first look at a file called DWEditFile it seems that Cello uses Structured Storage. Coming from OSX this was the first time I have come across this technology, which is proprietary to Microsoft, and not available under OSX.

Useful Links:
  1. Microsoft's Structured Storage Reference
  2. An Introduction to Structured Storage

There are a couple of open source projects that will read and write Structured Storage documents.
  1. GNOME - Structured File Library (GPL)
  2. POLE - Portable C++ library to access OLE Storage (BSD)

POLE seems to be very lightweight and straightforward. Looking through the code, on the surface, it lacks:
  1. Some exception handling.
  2. A copy operation - for copying bits of one document into another. This is something that Cello uses.
Neither of these seems that difficult to implement. The trick to it would seem to be to build a test bed to create and test the document - including the new copy operation.

The final thing that is different is that the way of accessing different parts of the document (structured documents are hierarchical) using POLE is very different to the Windows way. MFC seems to have iterators that will iterate through a given part of the document structure. I wonder if the best way to deal with this would be to build an iterator like the MFC one on top of POLE - which minimizes the changes that will need to be made to Cello. Also I can write separate test code for this.

After doing the easy part of deploying POLE I had some problems understanding how the existing code worked and how to port it onto POLE. Strangely the DWEditFile is not the main document file (a bad assumption on my part) - the thing now is to figure out where it is used. It could be a skeleton.

Monday, May 5, 2008

Preferences and files that will never be ported

While working through the first set of Cello utility files I came across a couple of files - the first was DWNamedTreeStorage, an XML reader and writer class. I fixed the compile errors (small string-type stuff) and moved on. It occurred to me that I could implement whatever there is in the file with some of Core Foundation, which has good support for reading and writing XML. However that is deviating from the game-plan. The game-plan is to get a subset (just reading files and generating the output file) as quickly as possible. Everything in Cello running on Windows works and it works well - there are many other wheels looking for reinvention.

Following this was a file called DWNamedTreeStorageRegistry. There was a fair amount in this file that would not compile and much of it was very OS specific - stepping back and looking at where it was used - I found that it is used to save and load preferences from the registry. I have moved this file off the hit-list and will implement its functionality afterwards using CFPreferences when I come to the client code that uses it. It may well be that DWNamedTreeStorage is never used.

To keep a record of files that will never be ported - but whose functionality will have to be otherwise implemented I have added a text file called 'BadList' to each library.

Sunday, May 4, 2008

CVS and Cutting the first Strings

I have created a new CVS repository and checked in the original sources. CVS is not the best source control system but it is not bad and does the job. I had at the outset thought about getting the project professionally hosted - there is some good Trac Subversion hosting out there - but for the moment I will save the money.

I have started working through some of the utility classes and getting them to build. The code is well organized, structured and generally well commented. Coming from a long legacy project which in reality is a bit out of control, working with code like this is a real breath of fresh air. I was fortunate enough to have spent a day looking at the code before buying into it; during that initial review I could see that it was well written and of a high quality. The most daunting aspect of it all is the sheer quantity - there are hundreds and hundreds of files.

My first step is just to get things to compile. There are a few fun and games with templates as gcc is quite strict. There are (obviously) a lot of types defined in MFC - things like BYTE, DWORD etc. I am making some stub headers with the minimum of declarations in them.

I have taken the MFC CString class and turned it into a wrapper for a CFStringRef - the code has its own string class and it occurs to me that down the road I might consolidate them. The Windows version of Cello is not Unicode which is something that down the line I will want to change. Making CString a wrapper for CFStringRef is a start - and those parts of the interface to CString that are char * I am scoping in #ifdefs so that down the line I can flip the define, have my build break and work through the compiler errors. The other thing is that the CFString is, when I come to the UI, an NSString - ready for Cocoa.

I have, after a good day's work, got 8 files compiling - the first took half a day but it gets quicker!

Saturday, May 3, 2008

NASM another way out?

The .intel_syntax directive is not supported under the XCode Gas assembler. This is not an option.

In searching for information on converting MASM to Gas I have found that, as I understand it, most assembly work is actually done on NASM, and indeed this assembler has been available as part of the default XCode installation for a long time. Gas, it seems, is primarily designed as an assembler for GCC rather than as a stand-alone assembler. NASM is also closer to MASM than Gas.

MASM assembler still needs to be converted to NASM. I found a Perl program, Nomyso, that will do a fair bit of the work. It looks far more promising than intel2gas which proved to be very disappointing.

Problems using Nomyso:
  1. Handling of structs seems wrong
  2. Included files are copied into the resulting file rather than being converted to %include.

Some useful links:
  1. A good survey of Assemblers
  2. Apple NASM Man Page
  3. NASM Documentation
  4. Writing A Useful Program With NASM

For the moment I am going to put the business of getting jcalg1 to assemble under OSX/XCode to one side as I can work around it.

Friday, May 2, 2008

jcalg1 - MASM and Gas Attack

Naïvely I assumed that assembler was assembler - this assumption, like the best assumptions, was based on ignorance. I did assembler back in my college days on an old PDP-11, later on the 6502 and a little on the 68000, and from this it never occurred to me that different assemblers (with the same target language) could be so very different. jcalg1 assembles under MASM and XCode uses Gas (gcc) - and they are quite different. My attempts to Google-up a Gas version of jcalg1 did not yield anything.

The options from here seem to be:
  1. Convert jcalg1 to Gas. I have found a tool that claims to be able to do this, intel2gas - but there is a large amount that it just can't handle.
  2. Use the .intel_syntax directive in Gas and fix up all the errors manually. This directive makes Gas accept Intel syntax, which is closer to MASM, instead of its default AT&T syntax.
  3. Modify my approach - change the original windows version of Cello so that it saves and opens files that are not compressed and thereby just side-step this problem. This would allow me to still create sample files.
Of these approaches the last will obviously work - however I will have a crack at the .intel_syntax directive and see if it has mileage.

Thursday, May 1, 2008

Initial plan of attack

My plan of attack to get Cello out of the water is to port the file operations first. The first goal is to have something running on the Mac that can read a file (sample files generated from the Windows version) and then to be able to create the results (also a file). After that I can look at the visual representation of the file (getting it all to draw), the UI and all of the rest of it.

The program is written almost entirely in C++, so I figured that my start would be to map the file operations (etc.) to Core Foundation. Looking at the code, and from the brief tour of it from one of the Windows engineers, Cello reads and writes compressed files. It uses a library called jcalg1 which is written in assembler. So the first dirty hands will be getting it to assemble in XCode.