OpenType Layout in PDF

Hindi, Swash capitals and more

PDF documents are mostly text, and the PDF Library has been able to write text since day one, using a simple layout algorithm that moves left to right with kerning (or right to left, with special shaping rules for Arabic). As befits such a heavily used algorithm, it's extremely quick.
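To make "left to right with kerning" concrete, here is a minimal sketch of that kind of loop. It is not the PDF Library's implementation: the class name SimpleLayout, the advance widths and the kerning pairs are all hypothetical values chosen for illustration, where a real engine would read them from the font.

```java
import java.util.HashMap;
import java.util.Map;

final class SimpleLayout {
    // Hypothetical metrics in 1/1000 em; real values come from the font file.
    static final Map<Character, Integer> ADVANCE = new HashMap<>();
    static final Map<String, Integer> KERN = new HashMap<>();
    static {
        ADVANCE.put('A', 700);
        ADVANCE.put('V', 650);
        ADVANCE.put('o', 500);
        KERN.put("AV", -80);   // pull V closer to a preceding A
        KERN.put("Vo", -40);   // pull o closer to a preceding V
    }

    /** Returns the x offset of each character, with the total width as the last entry. */
    static int[] layout(String text) {
        int[] x = new int[text.length() + 1];
        int pen = 0;
        for (int i = 0; i < text.length(); i++) {
            if (i > 0) {
                // Adjust the pen by the kerning value for this pair, if any.
                pen += KERN.getOrDefault(text.substring(i - 1, i + 1), 0);
            }
            x[i] = pen;
            // Characters without an entry fall back to an assumed default width.
            pen += ADVANCE.getOrDefault(text.charAt(i), 600);
        }
        x[text.length()] = pen;
        return x;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(layout("AVo")));
    }
}
```

Each character maps to one glyph and is placed in input order, which is why a loop like this can be so fast.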

However, there are a few languages the algorithm can't handle. Hindi, Bengali, Kannada and other Indic languages have very complex shaping rules, and I've no doubt there are others too. These rules are usually defined in special layout tables inside the OpenType font, and since PDF Library version 2.11.22 we can optionally use the algorithm defined in these tables for text layout.
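In Devanagari, for instance, the vowel sign U+093F is drawn to the left of the consonant it follows in the text, which a strictly one-character-at-a-time layout cannot express. One rough way to tell whether a string falls into this category is to look at the Unicode script of each character. The sketch below uses only the Java standard library; the class name ScriptCheck, the method containsIndic and the particular set of scripts are assumptions made for illustration and are not part of the PDF Library.

```java
import java.util.EnumSet;
import java.util.Set;

final class ScriptCheck {
    // An illustrative, deliberately incomplete set of scripts with complex shaping rules.
    private static final Set<Character.UnicodeScript> INDIC = EnumSet.of(
        Character.UnicodeScript.DEVANAGARI,
        Character.UnicodeScript.BENGALI,
        Character.UnicodeScript.GURMUKHI,
        Character.UnicodeScript.GUJARATI,
        Character.UnicodeScript.TAMIL,
        Character.UnicodeScript.TELUGU,
        Character.UnicodeScript.KANNADA,
        Character.UnicodeScript.MALAYALAM);

    /** Returns true if any code point in the text belongs to one of the scripts above. */
    static boolean containsIndic(String text) {
        return text.codePoints()
                   .anyMatch(cp -> INDIC.contains(Character.UnicodeScript.of(cp)));
    }

    public static void main(String[] args) {
        System.out.println(containsIndic("Hello"));
        System.out.println(containsIndic("Hello \u0926\u0941\u0928\u093f\u092f\u093e"));
    }
}
```

A check like this is only a heuristic: it says nothing about whether a given font actually contains the layout tables needed for the script.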

Hello दुनिया

Here's a quick example showing some Hindi. You'll need to download an appropriate font. We're using Gurumaa, which is a free download, but any Unicode Devanagari font should work.

import java.util.*;
import java.io.*;
import java.awt.Color;
import org.faceless.pdf2.*;

public class HelloWorld {
  public static void main(String[] args) throws IOException {
    PDF pdf = new PDF();
    PDFPage page = pdf.newPage("A4");

    PDFStyle style = new PDFStyle();
    InputStream in = new FileInputStream("gurumaa-150.ttf");
    PDFFont font = new OpenTypeFont(in, 2);   // changed: OpenTypeFont with 2 as the second argument
    in.close();

    String text = "\u0939\u093f\u0928\u094d\u0926\u0940"; // हिन्दी
    font.setFeature("opentype", true);        // changed: turn on the "opentype" feature

    style.setFont(font, 24);
    style.setFillColor(Color.black);

    page.setStyle(style);
    page.drawText(text, 100, page.getHeight()-100);

    OutputStream fo = new FileOutputStream("HelloWorld.pdf");
    pdf.render(fo);
    fo.close();
  }
}

Only two lines differ from a regular "HelloWorld" in English: the font must be loaded as an OpenTypeFont with "2" as the second argument, and you must call the setFeature method on the font to turn on the "opentype" feature. And that's it.

Not just for Indic

Turning the opentype feature on switches the internal layout engine. It's not on by default because it's considerably slower, so generally we'd recommend leaving it off unless you need its features. But what else can it do?

The primary use for these OpenType tables is for Indic and Arabic languages (although our own Arabic layout engine is much faster and on by default, so there's no need to switch for Arabic). However, there are a few fonts out there that contain other special OpenType layout features. One of our favourites is Megalopolis, which certainly gave us some headaches during testing. To see which features a font has available, print out the list of features returned by PDFFont.getAvailableFeatures(). For Megalopolis this gives the following list:

opentype, opentype.fina, opentype.liga, opentype.dlig, opentype.kern, opentype.aalt, opentype.case, opentype.dnom, opentype.frac, opentype.lnum, opentype.locl, opentype.numr, opentype.onum, opentype.ordn, opentype.ornm, opentype.pnum, opentype.salt, opentype.sinf, opentype.ss01, opentype.ss02, opentype.ss03, opentype.ss04, opentype.ss05, opentype.ss06, opentype.sups, opentype.tnum, latinligatures

To decipher this you'll need the list of OpenType features, which tells us we have Discretionary Ligatures (dlig), Fractions (frac), the fantastically named Scientific Inferiors (sinf) and others. The descriptions are a bit dry and a picture is worth a thousand words, so we're going to assume you know which features you want to use. Turn on the features you want:

String text = "EXTRAORDINAIRE H2O";
font.setFeature("opentype", true);
font.setFeature("opentype.dlig", true);
font.setFeature("opentype.sinf", true);
and then see the results.
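If you'd rather not look up each tag by hand, a small lookup table does the job. The sketch below covers only a handful of the tags in the list above, using their names from the OpenType feature registry; the class name FeatureNames and the output format are assumptions for illustration, and the code does not call the PDF Library at all.

```java
import java.util.HashMap;
import java.util.Map;

final class FeatureNames {
    private static final String PREFIX = "opentype.";
    private static final Map<String, String> NAMES = new HashMap<>();
    static {
        NAMES.put("liga", "Standard Ligatures");
        NAMES.put("dlig", "Discretionary Ligatures");
        NAMES.put("kern", "Kerning");
        NAMES.put("frac", "Fractions");
        NAMES.put("sinf", "Scientific Inferiors");
        NAMES.put("sups", "Superscript");
        NAMES.put("onum", "Oldstyle Figures");
        NAMES.put("lnum", "Lining Figures");
        NAMES.put("salt", "Stylistic Alternates");
    }

    /** Turns an entry such as "opentype.dlig" into "dlig: Discretionary Ligatures". */
    static String describe(String feature) {
        if (!feature.startsWith(PREFIX)) {
            // Entries such as "opentype" or "latinligatures" carry no four-letter tag.
            return feature + ": (no OpenType tag)";
        }
        String tag = feature.substring(PREFIX.length());
        if (tag.matches("ss\\d\\d")) {
            // ss01 to ss20 are the numbered Stylistic Sets.
            return tag + ": Stylistic Set " + Integer.parseInt(tag.substring(2));
        }
        return tag + ": " + NAMES.getOrDefault(tag, "(not in this sketch's table)");
    }

    public static void main(String[] args) {
        for (String f : new String[] {"opentype", "opentype.dlig", "opentype.sinf", "opentype.ss03"}) {
            System.out.println(describe(f));
        }
    }
}
```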

Report Generator

Enabling these features is just as simple in the Report Generator. The new font-feature-settings attribute can be set to turn features on or off for any element. Combining our two examples above, we could do this in the Report Generator like this:

<?xml version="1.0"?>
<!DOCTYPE pdf PUBLIC "-//big.faceless.org//report" "report-1.1.dtd">
<pdf>
<head>
<link name="megalopolis" type="font" subtype="opentype" src="MEgalopolisExtra.otf" bytes="2" />
<link name="gurumaa" type="font" subtype="opentype" src="gurumaa-150.ttf" bytes="2" />
<style>
 body        { font-size:24pt }
 h1          { font-family:megalopolis; font-feature-settings:opentype,opentype.dlig,opentype.sinf }
 *:lang(hi)  { font-family:gurumaa; font-feature-settings:opentype }
</style>
</head>
<body>
 <h1>EXTRAORDINAIRE H2O</h1>
 <p>
  The word for "World" in Hindi is <span lang="hi">दुनिया</span>.
 </p>
</body>
</pdf>
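If the stylesheet is generated from code, the attribute value can be assembled from a list of tags. This is a minimal sketch that only builds the comma-separated form used in the stylesheet above; the class name FeatureSettings and the method css are hypothetical helpers, not part of the Report Generator.

```java
import java.util.Arrays;
import java.util.List;

final class FeatureSettings {
    /** Builds a declaration such as "font-feature-settings:opentype,opentype.dlig". */
    static String css(List<String> tags) {
        StringBuilder sb = new StringBuilder("font-feature-settings:opentype");
        for (String tag : tags) {
            // Each four-letter tag is prefixed with "opentype.", as in the example above.
            sb.append(",opentype.").append(tag);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(css(Arrays.asList("dlig", "sinf")));
    }
}
```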

Finally, a real use for the CSS :lang selector! This should get you started, and we look forward to seeing plenty of swash capitals in your test cases over the next few months.