% This is ACCENTS.WEB in text format, as of April 26, 1991. % Copyright (C) 1991 Jiri Zlatuska (zlatuska at cspuni12.bitnet) % % This program is free software; you can redistribute it and/or modify % it under the terms of the GNU General Public License as published by % the Free Software Foundation; either version 1, or (at your option) % any later version. % % You should have received a copy of the GNU General Public License % along with this program; if not, write to the Free Software % Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. % % Version 1 was implemented in April 1991. % Here is TeX material that gets inserted after \input webmac \def\hang{\hangindent 3em\indent\ignorespaces} \font\ninerm=cmr9 \let\mc=\ninerm % medium caps for names like SAIL \def\PASCAL{Pascal} \def\(#1){} % this is used to make section names sort themselves better \def\9#1{} % this is used for sort keys in the index \def\title{ACCENTS} \def\contentspagenumber{101} \def\topofcontents{\null \def\titlepage{F} % include headline on the contents page \def\rheader{\mainfont\hfil \contentspagenumber} \vfill \centerline{\titlefont The {\ttitlefont ACCENTS} processor} \vskip 5pt \centerline{Copyright \copyright\ 1991 Ji\v r\'\i\ Zlatu\v ska} \centerline{Distributed under terms of GNU General Public License} \vskip 15pt \centerline{(Version 1, April 1991)} \vfill} \def\botofcontents{\vfill \centerline{\hsize 5in\baselineskip9pt \vbox{\ninerm\noindent This program was developed at the Institute of Computer Science of Masaryk University, Brno, Czechoslovakia. `\TeX' is a trademark of the American Mathematical Society.}}} \pageno=\contentspagenumber \advance\pageno by 1 @* Introduction. The \.{ACCENTS} utility program generates a virtual font (``\.{VF}'') file and its associated \TeX\ font metric (``\.{TFM}'') file containing character codes for accented letters arranged according to the \.{KOI8-CS} character table layout; the input being the \.{TFM} file of the source (unaccented) font. It also makes a thorough check of the given files, using algorithms that are essentially the same as those used by \.{DVI} device drivers and by \TeX. Thus \TeX\ or a \.{DVI} driver should never complain that the resulting \.{TFM} or \.{VF} file is ``bad''. \.{ACCENTS} can recognize \.{ADOBE} standard encoding scheme, and provides re-arranging of the characters into \TeX\ text font layout with just the relevant characters selected. \indent\.{ACCENTS} is based on the input part of \.{VFtoVP}, and the output part of \.{VPtoVF} programs, which are parts of the standard \TeX ware library. The |banner| string defined here should be changed whenever \.{ACCENTS} gets modified. @d banner=='This is ACCENTS, Version 1' {printed when the program starts} @d copyright=='Copyright (C) 1990 Jiri Zlatuska' @ This program is written entirely in standard \PASCAL, except that it occasionally has lower case letters in strings that are output. Such letters can be converted to upper case if necessary. The input is read from |T_tfm_file|, and |adj_file|; the output is written on |tfm_file| and |vf_file|. Error messages and other remarks are written on the |output| file, which the user may choose to assign to the terminal if the system permits it. @^system dependencies@> The term |print| is used instead of |write| when this program writes on the |output| file, so that all such output can be easily deflected. @d print(#)==write(#) @d print_ln(#)==write_ln(#) @p program ACCENTS(@!T_tfm_file,@!vf_file,@!adj_file,@!tfm_file,@!output); label @@/ const @@/ type @@/ var @@/ procedure initialize; {this procedure gets things started properly} var @!k:integer; {all-purpose index for initialization} @@/ begin print_ln(banner);@/ print_ln(copyright); print_ln('Distributed under terms of GNU General Public License');@/ @@/ end; @ If the program has to stop prematurely, it goes to the `|final_end|'. @d final_end=9999 {label for the end of it all} @=final_end; @ The following parameters can be changed at compile time to extend or reduce \.{ACCENTS}'s capacity. @= @!T_tfm_size=30000; {maximum length of |tfm| data, in bytes} @!T_lig_size=5000; {maximum length of |lig_kern| program, in words} @!max_header_bytes=100; {four times the maximum number of words allowed in the \.{TFM} file header block, must be 1024 or less} @!vf_size=10000; {maximum length of |vf| data, in bytes} @!max_stack=100; {maximum depth of simulated \.{DVI} stack} @!max_param_words=30; {the maximum number of \.{fontdimen} parameters allowed} @!max_lig_steps=5000; {maximum length of ligature program, must be at most $32767-257=32510$} @!max_kerns=500; {the maximum number of distinct kern values} @!hash_size=5003; {preferably a prime number, a bit larger than the number of character pairs in lig/kern steps} @!name_length=50; {a file name shouldn't be longer than this} @!buf_size=60; {length of lines displayed in error messages} @ Here are some macros for common programming idioms. @d incr(#) == #:=#+1 {increase a variable by unity} @d decr(#) == #:=#-1 {decrease a variable by unity} @d do_nothing == {empty statement} @d exit=10 {go here to leave a procedure} @d unfound=45 {go here when you've found nothing} @d return==goto exit {terminate a procedure call} @f return==nil @ We assume that |case| statements may include a default case that applies if no matching label is found. Thus, we shall use constructions like @^system dependencies@> $$\vbox{\halign{#\hfil\cr |case x of|\cr 1: $\langle\,$code for $x=1\,\rangle$;\cr 3: $\langle\,$code for $x=3\,\rangle$;\cr |othercases| $\langle\,$code for |x<>1| and |x<>3|$\,\rangle$\cr |endcases|\cr}}$$ since most \PASCAL\ compilers have plugged this hole in the language by incorporating some sort of default mechanism. For example, the compiler used to develop \.{WEB} and \TeX\ allows `|others|:' as a default label, and other \PASCAL s allow syntaxes like `\ignorespaces|else|\unskip' or `\&{otherwise}' or `\\{otherwise}:', etc. The definitions of |othercases| and |endcases| should be changed to agree with local conventions. (Of course, if no default mechanism is available, the |case| statements of this program must be extended by listing all remaining cases. Donald~E. Knuth, the author of the \.{WEB} system program \.{TANGLE}, @^Knuth, Donald Ervin@> would have taken the trouble to modify \.{TANGLE} so that such extensions were done automatically, if he had not wanted to encourage \PASCAL\ compiler writers to make this important change in \PASCAL, where it belongs.) @d othercases == others: {default for cases not listed explicitly} @d endcases == @+end {follows the default case in an extended |case| statement} @f othercases == else @f endcases == end @= @!byte=0..255; {unsigned eight-bit quantity} @* Font metric data. The idea behind \.{TFM} files is that typesetting routines like \TeX\ need a compact way to store the relevant information about several dozen fonts, and computer centers need a compact way to store the relevant information about several hundred fonts. \.{TFM} files are compact, and most of the information they contain is highly relevant, so they provide a solution to the problem. The information in a \.{TFM} file appears in a sequence of 8-bit bytes. Since the number of bytes is always a multiple of 4, we could also regard the file as a sequence of 32-bit words; but \TeX\ uses the byte interpretation, and so does \.{ACCENTS} at this point. Note that the bytes are considered to be unsigned numbers. @= @!T_tfm_file:packed file of byte; @ On some systems you may have to do something special to read a packed file of bytes. For example, the following code didn't work when it was first tried at Stanford, because packed files have to be opened with a special switch setting on the \PASCAL\ that was used. @^system dependencies@> @= reset(T_tfm_file); @ The first 24 bytes (6 words) of a \.{TFM} file contain twelve 16-bit integers that give the lengths of the various subsequent portions of the file. These twelve integers are, in order: $$\vbox{\halign{\hfil#&$\null=\null$#\hfil\cr |@!lf|&length of the entire file, in words;\cr |@!lh|&length of the header data, in words;\cr |@!bc|&smallest character code in the font;\cr |@!ec|&largest character code in the font;\cr |@!nw|&number of words in the width table;\cr |@!nh|&number of words in the height table;\cr |@!nd|&number of words in the depth table;\cr |@!ni|&number of words in the italic correction table;\cr |@!nl|&number of words in the lig/kern table;\cr |@!nk|&number of words in the kern table;\cr |@!ne|&number of words in the extensible character table;\cr |@!np|&number of font parameter words.\cr}}$$ They are all nonnegative and less than $2^{15}$. We must have |bc-1<=ec<=255|, |ne<=256|, and $$\hbox{|lf=6+lh+(ec-bc+1)+nw+nh+nd+ni+nl+nk+ne+np|.}$$ Note that a font may contain as many as 256 characters (if |bc=0| and |ec=255|), and as few as 0 characters (if |bc=ec+1|). Incidentally, when two or more 8-bit bytes are combined to form an integer of 16 or more bits, the most significant bytes appear first in the file. This is called BigEndian order. @= @!T_lf,@!T_lh,@!T_bc,@!T_ec,@!T_nw,@!T_nh,@!T_nd,@!T_ni,@!T_nl, @!T_nk,@!T_ne,@!T_np:0..@'77777; {subfile sizes} @ The rest of the \.{TFM} file may be regarded as a sequence of ten data arrays having the informal specification $$\def\arr$[#1]#2${\&{array} $[#1]$ \&{of} #2} \vbox{\halign{\hfil\\{#}&$\,:\,$\arr#\hfil\cr header&|[0..lh-1]stuff|\cr char\_info&|[bc..ec]char_info_word|\cr width&|[0..nw-1]fix_word|\cr height&|[0..nh-1]fix_word|\cr depth&|[0..nd-1]fix_word|\cr italic&|[0..ni-1]fix_word|\cr lig\_kern&|[0..nl-1]lig_kern_command|\cr kern&|[0..nk-1]fix_word|\cr exten&|[0..ne-1]extensible_recipe|\cr param&|[1..np]fix_word|\cr}}$$ The most important data type used here is a |@!fix_word|, which is a 32-bit representation of a binary fraction. A |fix_word| is a signed quantity, with the two's complement of the entire word used to represent negation. Of the 32 bits in a |fix_word|, exactly 12 are to the left of the binary point; thus, the largest |fix_word| value is $2048-2^{-20}$, and the smallest is $-2048$. We will see below, however, that all but one of the |fix_word| values will lie between $-16$ and $+16$. @ The first data array is a block of header information, which contains general facts about the font. The header must contain at least two words, and for \.{TFM} files to be used with Xerox printing software it must contain at least 18 words, allocated as described below. When different kinds of devices need to be interfaced, it may be necessary to add further words to the header block. \yskip\hang|header[0]| is a 32-bit check sum that \TeX\ will copy into the \.{DVI} output file whenever it uses the font. Later on when the \.{DVI} file is printed, possibly on another computer, the actual font that gets used is supposed to have a check sum that agrees with the one in the \.{TFM} file used by \TeX. In this way, users will be warned about potential incompatibilities. (However, if the check sum is zero in either the font file or the \.{TFM} file, no check is made.) The actual relation between this check sum and the rest of the \.{TFM} file is not important; the check sum is simply an identification number with the property that incompatible fonts almost always have distinct check sums. @^check sum@> \yskip\hang|header[1]| is a |fix_word| containing the design size of the font, in units of \TeX\ points (7227 \TeX\ points = 254 cm). This number must be at least 1.0; it is fairly arbitrary, but usually the design size is 10.0 for a ``10 point'' font, i.e., a font that was designed to look best at a 10-point size, whatever that really means. When a \TeX\ user asks for a font `\.{at} $\delta$ \.{pt}', the effect is to override the design size and replace it by $\delta$, and to multiply the $x$ and~$y$ coordinates of the points in the font image by a factor of $\delta$ divided by the design size. {\sl All other dimensions in the\/\ \.{TFM} file are |fix_word|\kern-1pt\ numbers in design-size units.} Thus, for example, the value of |param[6]|, one \.{em} or \.{\\quad}, is often the |fix_word| value $2^{20}=1.0$, since many fonts have a design size equal to one em. The other dimensions must be less than 16 design-size units in absolute value; thus, |header[1]| and |param[1]| are the only |fix_word| entries in the whole \.{TFM} file whose first byte might be something besides 0 or 255. @^design size@> \yskip\hang|header[2..11]|, if present, contains 40 bytes that identify the character coding scheme. The first byte, which must be between 0 and 39, is the number of subsequent ASCII bytes actually relevant in this string, which is intended to specify what character-code-to-symbol convention is present in the font. Examples are \.{ASCII} for standard ASCII, \.{TeX text} for fonts like \.{cmr10} and \.{cmti9}, \.{TeX math extension} for \.{cmex10}, \.{XEROX text} for Xerox fonts, \.{GRAPHIC} for special-purpose non-alphabetic fonts, \.{UNSPECIFIED} for the default case when there is no information. Parentheses should not appear in this name. (Such a string is said to be in {\mc BCPL} format.) @^coding scheme@> \yskip\hang|header[12..16]|, if present, contains 20 bytes that name the font family (e.g., \.{CMR} or \.{HELVETICA}), in {\mc BCPL} format. This field is also known as the ``font identifier.'' @^family name@> @^font identifier@> \yskip\hang|header[17]|, if present, contains a first byte called the |seven_bit_safe_flag|, then two bytes that are ignored, and a fourth byte called the |face|. If the value of the fourth byte is less than 18, it has the following interpretation as a ``weight, slope, and expansion'': Add 0 or 2 or 4 (for medium or bold or light) to 0 or 1 (for roman or italic) to 0 or 6 or 12 (for regular or condensed or extended). For example, 13 is 0+1+12, so it represents medium italic extended. A three-letter code (e.g., \.{MIE}) can be used for such |face| data. \yskip\hang|header[18..@twhatever@>]| might also be present; the individual words are simply called |header[18]|, |header[19]|, etc., at the moment. @ Next comes the |char_info| array, which contains one |char_info_word| per character. Each |char_info_word| contains six fields packed into four bytes as follows. \yskip\hang first byte: |width_index| (8 bits)\par \hang second byte: |height_index| (4 bits) times 16, plus |depth_index| (4~bits)\par \hang third byte: |italic_index| (6 bits) times 4, plus |tag| (2~bits)\par \hang fourth byte: |remainder| (8 bits)\par \yskip\noindent The actual width of a character is |width[width_index]|, in design-size units; this is a device for compressing information, since many characters have the same width. Since it is quite common for many characters to have the same height, depth, or italic correction, the \.{TFM} format imposes a limit of 16 different heights, 16 different depths, and 64 different italic corrections. Incidentally, the relation |width[0]=height[0]=depth[0]=italic[0]=0| should always hold, so that an index of zero implies a value of zero. The |width_index| should never be zero unless the character does not exist in the font, since a character is valid if and only if it lies between |bc| and |ec| and has a nonzero |width_index|. @ The |tag| field in a |char_info_word| has four values that explain how to interpret the |remainder| field. \yskip\hang|tag=0| (|no_tag|) means that |remainder| is unused.\par \hang|tag=1| (|lig_tag|) means that this character has a ligature/kerning program starting at |lig_kern[remainder]|.\par \hang|tag=2| (|list_tag|) means that this character is part of a chain of characters of ascending sizes, and not the largest in the chain. The |remainder| field gives the character code of the next larger character.\par \hang|tag=3| (|ext_tag|) means that this character code represents an extensible character, i.e., a character that is built up of smaller pieces so that it can be made arbitrarily large. The pieces are specified in |exten[remainder]|.\par @d no_tag=0 {vanilla character} @d lig_tag=1 {character has a ligature/kerning program} @d list_tag=2 {character has a successor in a charlist} @d ext_tag=3 {character is extensible} @ The |lig_kern| array contains instructions in a simple programming language that explains what to do for special letter pairs. Each word is a |lig_kern_command| of four bytes. \yskip\hang first byte: |skip_byte|, indicates that this is the final program step if the byte is 128 or more, otherwise the next step is obtained by skipping this number of intervening steps.\par \hang second byte: |next_char|, ``if |next_char| follows the current character, then perform the operation and stop, otherwise continue.''\par \hang third byte: |op_byte|, indicates a ligature step if less than~128, a kern step otherwise.\par \hang fourth byte: |remainder|.\par \yskip\noindent In a kern step, an additional space equal to |kern[256*(op_byte-128)+remainder]| is inserted between the current character and |next_char|. This amount is often negative, so that the characters are brought closer together by kerning; but it might be positive. There are eight kinds of ligature steps, having |op_byte| codes $4a+2b+c$ where $0\le a\le b+c$ and $0\le b,c\le1$. The character whose code is |remainder| is inserted between the current character and |next_char|; then the current character is deleted if $b=0$, and |next_char| is deleted if $c=0$; then we pass over $a$~characters to reach the next current character (which may have a ligature/kerning program of its own). Notice that if $a=0$ and $b=1$, the current character is unchanged; if $a=b$ and $c=1$, the current character is changed but the next character is unchanged. \.{ACCENTS} will check to see that infinite loops are avoided. If the very first instruction of the |lig_kern| array has |skip_byte=255|, the |next_char| byte is the so-called right boundary character of this font; the value of |next_char| need not lie between |bc| and~|ec|. If the very last instruction of the |lig_kern| array has |skip_byte=255|, there is a special ligature/kerning program for a left boundary character, beginning at location |256*op_byte+remainder|. The interpretation is that \TeX\ puts implicit boundary characters before and after each consecutive string of characters from the same font. These implicit characters do not appear in the output, but they can affect ligatures and kerning. If the very first instruction of a character's |lig_kern| program has |skip_byte>128|, the program actually begins in location |256*op_byte+remainder|. This feature allows access to large |lig_kern| arrays, because the first instruction must otherwise appear in a location |<=255|. Any instruction with |skip_byte>128| in the |lig_kern| array must have |256*op_byte+remainder= @!T_index=0..T_tfm_size; {address of a byte in |tfm|} @ @= @!T_tfm:array [-1000..T_tfm_size] of byte; {the \.{TFM} input data all goes here} {the negative addresses avoid range checks for invalid characters} @ The input may, of course, be all screwed up and not a \.{TFM} file at all. So we begin cautiously. @d abort(#)==begin print_ln(#); print_ln('Sorry, but I can''t go on; are you sure this is a TFM?'); goto final_end; end @d message(#)==begin if chars_on_line>0 then print_ln(' '); print_ln(#); end @= read(T_tfm_file,T_tfm[0]); if T_tfm[0]>127 then abort('The first byte of the input file exceeds 127!'); @.The first byte...@> if eof(T_tfm_file) then abort('The input file is only one byte long!'); @.The input...one byte long@> read(T_tfm_file,T_tfm[1]); T_lf:=T_tfm[0]*@'400+T_tfm[1]; if T_lf=0 then abort('The file claims to have length zero, but that''s impossible!'); @.The file claims...@> if 4*T_lf-1>T_tfm_size then abort('The file is bigger than I can handle!'); @.The file is bigger...@> for T_tfm_ptr:=2 to 4*T_lf-1 do begin if eof(T_tfm_file) then abort('The file has fewer bytes than it claims!'); @.The file has fewer bytes...@> read(T_tfm_file,T_tfm[T_tfm_ptr]); end; if not eof(T_tfm_file) then begin message('There''s some extra junk at the end of the TFM file,'); @.There's some extra junk...@> message('but I''ll proceed as if it weren''t there.'); end @ After the file has been read successfully, we look at the subfile sizes to see if they check out. @d eval_two_bytes(#)==begin if T_tfm[T_tfm_ptr]>127 then abort('One of the subfile sizes is negative!'); @.One of the subfile sizes...@> #:=T_tfm[T_tfm_ptr]*@'400+T_tfm[T_tfm_ptr+1]; T_tfm_ptr:=T_tfm_ptr+2; end @= begin T_tfm_ptr:=2;@/ eval_two_bytes(T_lh); eval_two_bytes(T_bc); eval_two_bytes(T_ec); eval_two_bytes(T_nw); eval_two_bytes(T_nh); eval_two_bytes(T_nd); eval_two_bytes(T_ni); eval_two_bytes(T_nl); eval_two_bytes(T_nk); eval_two_bytes(T_ne); eval_two_bytes(T_np); if T_lh<2 then abort('The header length is only ',T_lh:1,'!'); @.The header length...@> if T_nl>4*T_lig_size then abort('The lig/kern program is longer than I can handle!'); @.The lig/kern program...@> if (T_bc>T_ec+1)or(T_ec>255) then abort('The character code range ', @.The character code range...@> T_bc:1,'..',T_ec:1,'is illegal!'); if (T_nw=0)or(T_nh=0)or(T_nd=0)or(T_ni=0) then abort('Incomplete subfiles for character dimensions!'); @.Incomplete subfiles...@> if T_ne>256 then abort('There are ',T_ne:1,' extensible recipes!'); @.There are ... recipes@> if T_lf<>6+T_lh+(T_ec-T_bc+1)+T_nw+T_nh+T_nd+T_ni+T_nl+T_nk+T_ne+T_np then abort('Subfile sizes don''t add up to the stated total!'); @.Subfile sizes don't add up...@> end @ Once the input data successfully passes these basic checks, \.{ACCENTS} believes that it is a \.{TFM} file, and the generation of the derived font will take place. Access to the various subfiles is facilitated by computing the following base addresses. For example, the |char_info| for character |c| will start in location |4*(char_base+c)| of the |tfm| array. @= @!T_char_base,@!T_width_base,@!T_height_base,@!T_depth_base,@!T_italic_base, @!T_lig_kern_base,@!T_kern_base,@!T_exten_base,@!T_param_base:integer; {base addresses for the subfiles} @ @= begin T_char_base:=6+T_lh-T_bc; T_width_base:=T_char_base+T_ec+1; T_height_base:=T_width_base+T_nw; T_depth_base:=T_height_base+T_nh; T_italic_base:=T_depth_base+T_nd; T_lig_kern_base:=T_italic_base+T_ni; T_kern_base:=T_lig_kern_base+T_nl; T_exten_base:=T_kern_base+T_nk; T_param_base:=T_exten_base+T_ne-1; end @ A |fix_word| is a 32-bit integer that represents a real value that has been multiplied by $2^{20}$. Since \.{ACCENTS} restricts the magnitude of reals to 2048, the |fix_word| will have a magnitude less than $2^{31}$. @d unity==@'4000000 {$2^{20}$, the |fix_word| 1.0} @= @!fix_word=integer; {a scaled real value with 20 bits of fraction} @ Of course we want to define macros that suppress the detail of how the font information is actually encoded. Each word will be referred to by the |tfm| index of its first byte. For example, if |c| is a character code between |bc| and |ec|, then |tfm[char_info(c)]| will be the first byte of its |char_info|, i.e., the |width_index|; furthermore |width(c)| will point to the |fix_word| for |c|'s width. @d T_check_sum=24 @d T_design_size=T_check_sum+4 @d T_scheme=T_design_size+4 @d T_family=T_scheme+40 @d T_random_word=T_family+20 @d T_char_info(#)==4*(T_char_base+#) @d T_width_index(#)==T_tfm[T_char_info(#)] @d T_nonexistent(#)==((#T_ec)or(T_width_index(#)=0)or(not valid_code[#])) @d T_height_index(#)==(T_tfm[T_char_info(#)+1] div 16) @d T_depth_index(#)==(T_tfm[T_char_info(#)+1] mod 16) @d T_italic_index(#)==(T_tfm[T_char_info(#)+2] div 4) @d T_tag(#)==(T_tfm[T_char_info(#)+2] mod 4) @d T_reset_tag(#)==T_tfm[T_char_info(#)+2]:=4*T_italic_index(#)+no_tag @d T_remainder(#)==T_tfm[T_char_info(#)+3] @d T_width(#)==4*(T_width_base+T_width_index(#)) @d T_height(#)==4*(T_height_base+T_height_index(#)) @d T_depth(#)==4*(T_depth_base+T_depth_index(#)) @d T_italic(#)==4*(T_italic_base+T_italic_index(#)) @d T_exten(#)==4*(T_exten_base+T_remainder(#)) @d T_lig_step(#)==4*(T_lig_kern_base+(#)) @d T_kern(#)==4*(T_kern_base+#) {here \#\ is an index, not a character} @d T_param(#)==4*(T_param_base+#) {likewise} @p function fix_tfm(k:integer): fix_word; var a: fix_word; {accumulator} begin a:=T_tfm[k]; if a>=128 then a:=a-256; fix_tfm:=((256*a+T_tfm[k+1])*256+T_tfm[k+2])*256+T_tfm[k+3] end; @ One of the things we would like to do is take cognizance of fonts whose character coding scheme is \.{TeX math symbols} or \.{TeX math extension}; we will set the |font_type| variable to one of the three choices |vanilla|, |mathsy|, or |mathex|. @d vanilla=0 {not a special scheme} @d mathsy=1 {\.{TeX math symbols} scheme} @d mathex=2 {\.{TeX math extension} scheme} @= @!font_type:vanilla..mathex; {is this font special?} @ The next question is, ``What are \.{VF} files?'' A complete answer to that question appears in the documentation the \.{VFtoVP} program, so the details will not be repeated here. Suffice it to say that a \.{VF} file stores all of the relevant font information in a sequence of 8-bit bytes. The number of bytes is always a multiple of 4, so we could regard the files as sequences of 32-bit words; but \TeX\ uses the byte interpretation, and so does \.{ACCENTS}. Note that the bytes are considered to be unsigned numbers. @= @!vf_file:packed file of 0..255; @!tfm_file:packed file of 0..255; @ On some systems you may have to do something special to write a packed file of bytes. For example, the following code didn't work when it was first tried at Stanford, because packed files have to be opened with a special switch setting on the \PASCAL\ that was used. @^system dependencies@> @= rewrite(vf_file); rewrite(tfm_file); @* Storing the property values. When property values have been found, they are squirreled away in a bunch of arrays. The header information is unpacked into bytes in an array called |header_bytes|. The ligature/kerning program is stored in an array of type |four_bytes|. Another |four_bytes| array holds the specifications of extensible characters. The kerns and parameters are stored in separate arrays of |fix_word| values. Virtual font data goes into an array |vf| of single-byte values. We maintain information about only one local font. Instead of storing the design size in the header array, we will keep it in a |fix_word| variable until the last minute. The number of units in the design size is also kept in a |fix_word|. @d vf_store(#)== begin vf[vf_ptr]:=#; if vf_ptr=vf_size then message('I''m out of memory---increase my vfsize!') @.I'm out of memory...@> else incr(vf_ptr); end @p procedure vf_fix(@!opcode:byte;@!x:fix_word); var negative:boolean; @!k:0..4; {number of bytes to typeset} @!t:integer; {threshold} begin if design_units<>unity then x:=round((x/design_units)*1048576.0); if x>0 then negative:=false else begin negative:=true; x:=-1-x;@+end; if opcode=0 then begin k:=4; t:=@'100000000;@+end else begin t:=127; k:=1; while x>t do begin t:=256*t+255; incr(k); end; vf_store(opcode+k-1); t:=t div 128 +1; end; repeat if negative then begin vf_store(255-(x div t)); negative:=false; x:=(x div t)*t+t-1-x; end else vf_store((x div t) mod 256); decr(k); t:=t div 256; until k=0; end; @ Four-byte values are being put into |four_bytes| records containing (yes, you guessed it) four bytes. @= @!four_bytes=record @!b0:byte;@+@!b1:byte;@+@!b2:byte;@+@!b3:byte;@+end; @ @= @!header_bytes:array[header_index] of byte; {the header block} @!header_ptr:header_index; {the number of header bytes in use} @!design_units:fix_word; {reciprocal of the scaling factor} @!seven_bit_safe_flag:boolean; {does the file claim to be seven-bit-safe?} @!lig_kern:array[0..max_lig_steps] of four_bytes; {the ligature program} @!nl:0..32767; {the number of ligature/kern instructions so far} @!min_nl:0..32767; {the final value of |nl| must be at least this} @!kern:array[0..max_kerns] of fix_word; {the distinct kerning amounts} @!nk:0..max_kerns; {the number of entries of |kern|} @!exten:array[0..255] of four_bytes; {extensible character specs} @!ne:0..256; {the number of extensible characters} @!param:array[1..max_param_words] of fix_word; {\.{FONTDIMEN} parameters} @!np:0..max_param_words; {the largest parameter set nonzero} @!check_sum_specified:boolean; {did the user name the check sum?} @!bchar:0..256; {the right boundary character, or 256 if unspecified} @!vf:array[0..vf_size] of byte; {stored bytes for \.{VF} file} @!vf_ptr:0..vf_size; {first unused location in |vf|} @!vtitle_start:0..vf_size; {starting location of \.{VTITLE} string} @!vtitle_length:byte; {length of \.{VTITLE} string} @!fname_start:0..vf_size; {starting location of the local font name string} @!fname_length:byte; {length of the local font name tring} @!packet_start:array[byte] of 0..vf_size; {beginning location of character packet} @!packet_length:array[byte] of integer; {length of character packet} @ Parent font parameters are accessed by and stored in what follows here. @d font_checksum==fix_tfm(T_check_sum) {local font checksum} @d font_number=0 {local font id number} @d font_at==design_units {local font ``at size''} @d font_dsize==fix_tfm(T_design_size) {local font design size} @= @!farea_start: 0..vf_size; {beginning of local font area} @!farea_length: byte; {length of local font area} @ @= @!header_index=0..max_header_bytes; @!indx=0..@'77777; @ @= @!d:header_index; {an index into |header_bytes|} @ We start by setting up the default values. @d check_sum_loc=0 @d design_size_loc=4 @d coding_scheme_loc=8 @d family_loc=coding_scheme_loc+40 @d seven_flag_loc=family_loc+20 @d face_loc=seven_flag_loc+3 @= for d:=0 to 18*4-1 do header_bytes[d]:=0; header_bytes[8]:=11; header_bytes[9]:="T"; header_bytes[10]:="e"; header_bytes[11]:="X"; header_bytes[12]:=" "; header_bytes[13]:="t"; header_bytes[14]:="e"; header_bytes[15]:="x"; header_bytes[16]:="t"; header_bytes[17]:=" "; header_bytes[18]:="A"; header_bytes[19]:="C"; header_bytes[20]:="C"; header_bytes[21]:="E"; header_bytes[22]:="N"; header_bytes[23]:="T"; header_bytes[24]:="S"; @.KOI8-CS@> for d:=family_loc to family_loc+11 do header_bytes[d]:=header_bytes[d-40]; design_units:=unity; seven_bit_safe_flag:=false;@/ header_ptr:=18*4; nl:=0; min_nl:=0; nk:=0; ne:=0; np:=0;@/ check_sum_specified:=false; bchar:=256;@/ vf_ptr:=0; vtitle_start:=0; vtitle_length:=0; for k:=0 to 255 do packet_start[k]:=vf_size; for k:=0 to 127 do packet_length[k]:=1; for k:=128 to 255 do packet_length[k]:=2; @ Most of the dimensions, however, go into the |memory| array. There are at most 257 widths, 257 heights, 257 depths, and 257 italic corrections, since the value 0 is required but it need not be used. So |memory| has room for 1028 entries, each of which is a |fix_word|. An auxiliary table called |link| is used to link these words together in linear lists, so that sorting and other operations can be done conveniently. We also add four ``list head'' words to the |memory| and |link| arrays; these are in locations |width| through |italic|, i.e., 1 through 4. For example, |link[height]| points to the smallest element in the sorted list of distinct heights that have appeared so far, and |memory[height]| is the number of distinct heights. @d width=1 @d height=2 @d depth=3 @d italic=4 @d mem_size=1028+4 {number of nonzero memory addresses} @= @!pointer=0..mem_size; {an index into memory} @ The arrays |char_wd|, |char_ht|, |char_dp|, and |char_ic| contain pointers to the |memory| array entries where the corresponding dimensions appear. Two other arrays, |char_tag| and |char_remainder|, hold the other information that \.{TFM} files pack into a |char_info_word|. @d bchar_label==char_remainder[256] {beginning of ligature program for left boundary} @= @!memory:array[pointer] of fix_word; {character dimensions and kerns} @!mem_ptr:pointer; {largest |memory| word in use} @!link:array[pointer] of pointer; {to make lists of |memory| items} @!char_wd:array[byte] of pointer; {pointers to the widths} @!char_ht:array[byte] of pointer; {pointers to the heights} @!char_dp:array[byte] of pointer; {pointers to the depths} @!char_ic:array[byte] of pointer; {pointers to italic corrections} @!char_tag:array[byte] of no_tag..ext_tag; {character tags} @!char_remainder:array[0..256] of 0..65535; {pointers to ligature labels, next larger characters, or extensible characters} @ @= @!c:byte; {runs through all character codes} @ @= bchar_label:=@'77777; for c:=0 to 255 do begin char_wd[c]:=0; char_ht[c]:=0; char_dp[c]:=0; char_ic[c]:=0;@/ char_tag[c]:=no_tag; char_remainder[c]:=0; end; memory[0]:=@'17777777777; {an ``infinite'' element at the end of the lists} memory[width]:=0; link[width]:=0; {width list is empty} memory[height]:=0; link[height]:=0; {height list is empty} memory[depth]:=0; link[depth]:=0; {depth list is empty} memory[italic]:=0; link[italic]:=0; {italic list is empty} mem_ptr:=italic; @ As an example of these data structures, let us consider the simple routine that inserts a potentially new element into one of the dimension lists. The first parameter indicates the list head (i.e., |h=width| for the width list, etc.); the second parameter is the value that is to be inserted into the list if it is not already present. The procedure returns the value of the location where the dimension appears in |memory|. The fact that |memory[0]| is larger than any legal dimension makes the algorithm particularly short. We do have to handle two somewhat subtle situations. A width of zero must be put into the list, so that a zero-width character in the font will not appear to be nonexistent (i.e., so that its |char_wd| index will not be zero), but this does not need to be done for heights, depths, or italic corrections. Furthermore, it is necessary to test for memory overflow even though we have provided room for the maximum number of different dimensions in any legal font, since the \.{VPL} file might foolishly give any number of different sizes to the same character. @p function sort_in(@!h:pointer;@!d:fix_word):pointer; {inserts into list} var p:pointer; {the current node of interest} begin if (d=0)and(h<>width) then sort_in:=0 else begin p:=h; while d>=memory[link[p]] do p:=link[p]; if (d=memory[p])and(p<>h) then sort_in:=p else if mem_ptr=mem_size then begin message('Memory overflow: more than 1028 widths, etc'); @.Memory overflow...@> message('Congratulations! It''s hard to make this error.'); sort_in:=p; end else begin incr(mem_ptr); memory[mem_ptr]:=d; link[mem_ptr]:=link[p]; link[p]:=mem_ptr; incr(memory[h]); sort_in:=mem_ptr; end; end; end; @ When these lists of dimensions are eventually written to the \.{TFM} file, we may have to do some rounding of values, because the \.{TFM} file allows at most 256 widths, 16 heights, 16 depths, and 64 italic corrections. The following procedure takes a given list head |h| and a given dimension |d|, and returns the minimum $m$ such that the elements of the list can be covered by $m$ intervals of width $d$. It also sets |next_d| to the smallest value $d^\prime>d$ such that the covering found by this procedure would be different. In particular, if $d=0$ it computes the number of elements of the list, and sets |next_d| to the smallest distance between two list elements. (The covering by intervals of width |next_d| is not guaranteed to have fewer than $m$ elements, but in practice this seems to happen most of the time.) @= @!next_d:fix_word; {the next larger interval that is worth trying} @ Once again we can make good use of the fact that |memory[0]| is ``infinite.'' @p function min_cover(@!h:pointer;@!d:fix_word):integer; var p:pointer; {the current node of interest} @!l:fix_word; {the least element covered by the current interval} @!m:integer; {the current size of the cover being generated} begin m:=0; p:=link[h]; next_d:=memory[0]; while p<>0 do begin incr(m); l:=memory[p]; while memory[link[p]]<=l+d do p:=link[p]; p:=link[p]; if memory[p]-lm then begin excess:=memory[h]-m; k:=min_cover(h,0); d:=next_d; {now the answer is at least |d|} repeat d:=d+d; k:=min_cover(h,d); until k<=m; {first we ascend rapidly until finding the range} d:=d div 2; k:=min_cover(h,d); {now we run through the feasible steps} while k>m do begin d:=next_d; k:=min_cover(h,d); end; shorten:=d; end else shorten:=0; end; @ When we are nearly ready to output the \.{TFM} file, we will set |index[p]:=k| if the dimension in |memory[p]| is being rounded to the |k|th element of its list. @= @!index:array[pointer] of byte; @!excess:byte; {number of words to remove, if list is being shortened} @ Here is the procedure that sets the |index| values. It also shortens the list so that there is only one element per covering interval; the remaining elements are the midpoints of their clusters. @p procedure set_indices(@!h:pointer;@!d:fix_word); {reduces and indexes a list} var p:pointer; {the current node of interest} @!q:pointer; {trails one step behind |p|} @!m:byte; {index number of nodes in the current interval} @!l:fix_word; {least value in the current interval} begin q:=h; p:=link[q]; m:=0; while p<>0 do begin incr(m); l:=memory[p]; index[p]:=m; while memory[link[p]]<=l+d do begin p:=link[p]; index[p]:=m; decr(excess); if excess=0 then d:=0; end; link[q]:=p; memory[p]:=l+(memory[p]-l) div 2; q:=p; p:=link[p]; end; memory[h]:=m; end; @* Link source characters with the accented ones. These links are stored in the |char_links| array in such a way, that if there's another character linked with |c|, it's value is stored in |char_links[c]|, etc. A parallel array |link_continues| contains the indication, whether the chain continues further. These lists are always passed through starting from an unaccented character which, therefore, presents the only one member of the chain from which all the rest must be accesible. Whenever a character |c| appears as a value in |char_links|, there is a recipe for generating the corresponding accented version from the base character. The type of the modification involved can be deduced from |char_modification(c)| value (plus the original character, of course). @d grave_accent=@"12 @d acute_accent=@"13 @d circumflex=@"5E @d umlaut=@"7F @d tilde=@"7E @d macron=@"16 @d dot_accent=@"5F @d breve_accent=@"15 @d hachek=@"14 @d long_umlaut=@"7D @d circle_accent=@'27 @d cedilla_accent=@'30 @d dot_under="." @d bar_under=254 @d ogonek="`" @d hachek_after=255 @d suppress=@'40 @d goes_above==grave_accent, acute_accent, circumflex, umlaut, tilde, macron, dot_accent, breve_accent, hachek, long_umlaut, circle_accent @d ADOBE_ogonek=@'316 @ @= @!char_links: array[byte] of byte; {the link array} @!link_continues: array[byte] of boolean; {link continuation indication} @!not_conflict: array[byte] of boolean; {conflict characters masking out} @!char_modifications: array[byte] of byte; {modification recipe} @!proceed:boolean; {chain trigger} @ Here we define the appropriate combinations of accented characters. @d db1(#)==char_modifications[k]:=#; @d da1(#)==char_links[k]:=#; link_continues[k]:=true; k:=#;db1 @d dc1(#)==k:=#;da1 @d db2(#)==char_modifications[k]:=#;da1 @d da2(#)==char_links[k]:=#; link_continues[k]:=true; k:=#;db2 @d dc2(#)==k:=#;da2 @d db3(#)==char_modifications[k]:=#;da2 @d da3(#)==char_links[k]:=#; link_continues[k]:=true; k:=#;db3 @d dc3(#)==k:=#;da3 @d db4(#)==char_modifications[k]:=#;da3 @d da4(#)==char_links[k]:=#; link_continues[k]:=true; k:=#;db4 @d dc4(#)==k:=#;da4 @d db5(#)==char_modifications[k]:=#;da4 @d da5(#)==char_links[k]:=#; link_continues[k]:=true; k:=#;db5 @d dc5(#)==k:=#;da5 @d db6(#)==char_modifications[k]:=#;da5 @d da6(#)==char_links[k]:=#; link_continues[k]:=true; k:=#;db6 @d dc6(#)==k:=#;da6 @d db7(#)==char_modifications[k]:=#;da6 @d da7(#)==char_links[k]:=#; link_continues[k]:=true; k:=#;db7 @d dc7(#)==k:=#;da7 @d db8(#)==char_modifications[k]:=#;da7 @d da8(#)==char_links[k]:=#; link_continues[k]:=true; k:=#;db8 @d dc8(#)==k:=#;da8 @d db9(#)==char_modifications[k]:=#;da8 @d da9(#)==char_links[k]:=#; link_continues[k]:=true; k:=#;db9 @d dc9(#)==k:=#;da9 @d db10(#)==char_modifications[k]:=#;da9 @d da10(#)==char_links[k]:=#; link_continues[k]:=true; k:=#;db10 @d dc10(#)==k:=#;da10 @d db11(#)==char_modifications[k]:=#;da10 @d da11(#)==char_links[k]:=#; link_continues[k]:=true; k:=#;db11 @d dc11(#)==k:=#;da11 @= for k:=0 to 255 do begin char_links[k]:=0; link_continues[k]:=false; not_conflict[k]:=true end;@/ dc8("A")(@'200)(breve_accent)(@'201)(ogonek)(@'300)(grave_accent) (@'301)(acute_accent)(@'302)(circumflex)(@'303)(tilde) (@'304)(umlaut)(@'305)(circle_accent)@/ dc3("C")(@'202)(acute_accent)(@'203)(hachek) (@'307)(cedilla_accent)@/ dc1("D")(@'204)(hachek)@/ dc6("E")(@'205)(hachek)(@'206)(ogonek)(@'310)(grave_accent) (@'311)(acute_accent)(@'312)(circumflex)(@'313)(umlaut)@/ dc1("G")(@'207)(breve_accent) dc5("I")(@'235)(dot_accent)(@'314)(grave_accent) (@'315)(acute_accent)(@'316)(circumflex)(@'317)(umlaut)@/ dc3("L")(@'210)(acute_accent)(@'211)(hachek_after) (@'212)(suppress)@/ dc3("N")(@'213)(acute_accent)(@'214)(hachek)(@'321)(tilde)@/ dc6("O")(@'216)(long_umlaut)(@'322)(grave_accent)(@'323)(acute_accent) (@'324)(circumflex)(@'325)(tilde)(@'326)(umlaut)@/ dc2("R")(@'217)(acute_accent)(@'220)(hachek)@/ dc3("S")(@'221)(acute_accent)(@'222)(hachek)(@'223)(cedilla_accent)@/ dc2("T")(@'224)(hachek)(@'225)(cedilla_accent)@/ dc6("U")(@'226)(long_umlaut)(@'227)(circle_accent) (@'331)(grave_accent)(@'332)(acute_accent)(@'333)(circumflex) (@'334)(umlaut)@/ dc2("Y")(@'230)(umlaut)(@'335)(acute_accent)@/ dc3("Z")(@'231)(acute_accent)(@'232)(hachek)(@'233)(dot_accent)@/ dc8("a")(@'240)(breve_accent)(@'241)(ogonek)(@'340)(grave_accent) (@'341)(acute_accent)(@'342)(circumflex)(@'343)(tilde) (@'344)(umlaut)(@'345)(circle_accent)@/ dc3("c")(@'242)(acute_accent)(@'243)(hachek) (@'347)(cedilla_accent)@/ dc1("d")(@'244)(hachek_after)@/ dc6("e")(@'245)(hachek)(@'246)(ogonek)(@'350)(grave_accent) (@'351)(acute_accent)(@'352)(circumflex)(@'353)(umlaut)@/ dc1("g")(@'247)(breve_accent) dc4("i")(@'354)(grave_accent) (@'355)(acute_accent)(@'356)(circumflex)(@'357)(umlaut)@/ dc3("l")(@'250)(acute_accent)(@'251)(hachek_after) (@'252)(suppress)@/ dc3("n")(@'253)(acute_accent)(@'254)(hachek)(@'361)(tilde)@/ dc6("o")(@'256)(long_umlaut)(@'362)(grave_accent)(@'363)(acute_accent) (@'364)(circumflex)(@'365)(tilde)(@'366)(umlaut)@/ dc2("r")(@'257)(acute_accent)(@'260)(hachek)@/ dc3("s")(@'261)(acute_accent)(@'262)(hachek)(@'263)(cedilla_accent)@/ dc2("t")(@'264)(hachek_after)(@'265)(cedilla_accent)@/ dc6("u")(@'266)(long_umlaut)(@'267)(circle_accent) (@'371)(grave_accent)(@'372)(acute_accent)(@'373)(circumflex) (@'374)(umlaut)@/ dc2("y")(@'270)(umlaut)(@'375)(acute_accent)@/ dc3("z")(@'271)(acute_accent)(@'272)(hachek)(@'273)(dot_accent)@/ @ When user-defined font layout is used, the following code provides the initialization. This module serves just as an example: The user may supply their own national code layout. (This example contains ``KOI8-CS'' version used by the author in Czechoslovakia.) The user-defined font layout is selected by saying (USERSCHEME) in the auxiliary adjustment input. @= begin header_bytes[18]:="K"; header_bytes[19]:="O"; header_bytes[20]:="I"; header_bytes[21]:="8"; header_bytes[22]:="-"; header_bytes[23]:="C"; header_bytes[24]:="S"; @.ACCENTS@> for k:=0 to 255 do begin char_links[k]:=0; link_continues[k]:=false; not_conflict[k]:=true end;@/ dc3("A")(@'341)(acute_accent)(@'361)(umlaut)(@'370)(grave_accent)@/ dc1("C")(@'343)(hachek)@/ dc1("D")(@'344)(hachek)@/ dc2("E")(@'345)(hachek)(@'367)(acute_accent)@/ dc1("I")(@'351)(acute_accent)@/ dc2("L")(@'353)(acute_accent)(@'354)(hachek_after)@/ dc1("N")(@'356)(hachek)@/ dc3("O")(@'357)(acute_accent)(@'355)(umlaut)(@'360)(circumflex)@/ dc2("R")(@'362)(hachek)(@'346)(acute_accent)@/ dc1("S")(@'363)(hachek)@/ dc1("T")(@'364)(hachek)@/ dc3("U")(@'352)(circle_accent)(@'350)(umlaut)(@'365)(acute_accent)@/ dc1("Y")(@'371)(acute_accent)@/ dc1("Z")(@'372)(hachek)@/ dc3("a")(@'301)(acute_accent)(@'321)(umlaut)(@'330)(grave_accent)@/ dc1("c")(@'303)(hachek)@/ dc1("d")(@'304)(hachek_after)@/ dc2("e")(@'305)(hachek)(@'327)(acute_accent)@/ dc1("i")(@'311)(acute_accent)@/ dc2("l")(@'313)(acute_accent)(@'314)(hachek_after)@/ dc1("n")(@'316)(hachek)@/ dc3("o")(@'317)(acute_accent)(@'315)(umlaut)(@'320)(circumflex)@/ dc2("r")(@'322)(hachek)(@'306)(acute_accent)@/ dc1("s")(@'323)(hachek)@/ dc1("t")(@'324)(hachek_after)@/ dc3("u")(@'312)(circle_accent)(@'310)(umlaut)(@'325)(acute_accent)@/ dc1("y")(@'331)(acute_accent)@/ dc1("z")(@'332)(hachek)@/ end @ @d next_character== repeat proceed:=link_continues[temp_byte]; temp_byte:=char_links[temp_byte] until not_conflict[temp_byte] or (not proceed) {step on next character in the chain unless it's a conflict one} @* Source font encoding scheme. This version can recognize \.{ADOBE} font encoding scheme, and transform it into standard \TeX\ text font scheme. Some of the \.{ADOBE} font characters disappear in the course of this translation, they can nonetheless still be accessible from the parent font. When working decoding the font encoding scheme, for a character code |c| from the parent font, |valid_code(c)| is |true| iff there exists a code for |c| in the \TeX\ font encoding, with |decode[c]| being the corresponding \TeX\ character code; |encode| goes in the other direction (it is used when typesetting accents). @d TeX_encoding=0 @d ADOBE_encoding=1 @d max_encoding=1 {last encoding type used in this program} @= @!font_encoding:TeX_encoding..max_encoding; @!decode:array[byte]of byte; {decoding function} @!encode:array[byte]of byte; {encoding function} @!valid_code:array[byte]of boolean; {yields |true| iff the code is translatable to \TeX\ scheme} @!i_normal,@!i_dotless,@!j_normal,@!j_dotless:integer; @ \TeX\ font encoding is the default. @= font_encoding:=TeX_encoding; for k:=0 to 255 do begin valid_code[k]:=true; decode[k]:=k; encode[k]:=k; end; i_normal:=@'151;@/ i_dotless:=@'20;@/ j_normal:=@'152;@/ j_dotless:=@'21;@/ @ Non-\TeX\ encoding scheme recognition. If \.{TFM} which has been input contains indication of other than \TeX\ encoding known to this program, we set the |font_encoding| variable accordingly, and change |valid_code| and |decode| arrays. @d dee(#)==@'#; encode[@'#]:=k; @d de(#)==valid_code[@'#]:=true; k:=@'#; decode[@'#]:=dee @= if T_lh>1 then begin if (T_tfm[T_scheme]>=21) and@/ ((T_tfm[T_scheme+1]="A") or (T_tfm[T_scheme+1]="a")) and@/ ((T_tfm[T_scheme+2]="D") or (T_tfm[T_scheme+2]="d")) and@/ ((T_tfm[T_scheme+3]="O") or (T_tfm[T_scheme+3]="o")) and@/ ((T_tfm[T_scheme+4]="B") or (T_tfm[T_scheme+4]="b")) and@/ ((T_tfm[T_scheme+5]="E") or (T_tfm[T_scheme+5]="e")) and@/ ((T_tfm[T_scheme+6]="S") or (T_tfm[T_scheme+6]="s")) and@/ ((T_tfm[T_scheme+7]="T") or (T_tfm[T_scheme+7]="t")) and@/ ((T_tfm[T_scheme+8]="A") or (T_tfm[T_scheme+8]="a")) and@/ ((T_tfm[T_scheme+9]="N") or (T_tfm[T_scheme+9]="n")) and@/ ((T_tfm[T_scheme+10]="D") or (T_tfm[T_scheme+10]="d")) and@/ ((T_tfm[T_scheme+11]="A") or (T_tfm[T_scheme+11]="a")) and@/ ((T_tfm[T_scheme+12]="R") or (T_tfm[T_scheme+12]="r")) and@/ ((T_tfm[T_scheme+13]="D") or (T_tfm[T_scheme+13]="d")) and@/ ((T_tfm[T_scheme+14]="E") or (T_tfm[T_scheme+14]="e")) and@/ ((T_tfm[T_scheme+15]="N") or (T_tfm[T_scheme+15]="n")) and@/ ((T_tfm[T_scheme+16]="C") or (T_tfm[T_scheme+16]="c")) and@/ ((T_tfm[T_scheme+17]="O") or (T_tfm[T_scheme+17]="o")) and@/ ((T_tfm[T_scheme+18]="D") or (T_tfm[T_scheme+18]="d")) and@/ ((T_tfm[T_scheme+19]="I") or (T_tfm[T_scheme+19]="i")) and@/ ((T_tfm[T_scheme+20]="N") or (T_tfm[T_scheme+20]="n")) and@/ ((T_tfm[T_scheme+21]="G") or (T_tfm[T_scheme+21]="g")) then @ end; @ ADOBE font layout needs to define another translation table @= begin font_encoding:=ADOBE_encoding; message('Input TFM is ADOBE file encoding scheme.'); for k:=0 to "0"-1 do begin valid_code[k]:=false end; for k:="9"+1 to "A"-1 do begin valid_code[k]:=false end; for k:="Z"+1 to "a"-1 do begin valid_code[k]:=false end; for k:="z"+1 to 255 do begin valid_code[k]:=false end; de(256)(14) de(257)(15)@/ de(365)(20) de(301)(22) de(302)(23) de(317)(24) de(306)(25) de(305)(26) de(312)(27)@/ de(313)(30) de(373)(31) de(361)(32) de(372)(33) de(371)(34) de(341)(35) de(352)(36) de(351)(37)@/ de(41)(41) de(272)(42) de(43)(43) de(44)(44) de(45)(45) de(46)(46) de(47)(47)@/ de(50)(50) de(51)(51) de(52)(52) de(53)(53) de(54)(54) de(55)(55) de(56)(56) de(57)(57)@/ de(72)(72) de(73)(73) de(241)(74) de(75)(75) de(277)(76) de(77)(77)@/ de(100)(100)@/ de(133)(133) de(252)(134) de(135)(135) de(303)(136) de(307)(137)@/ de(140)(140)@/ de(261)(173) de(320)(174) de(315)(175) de(304)(176) de(310)(177)@/ i_normal:=@'151;@/ i_dotless:=@'365;@/ j_normal:=@'151; {dotless j doesn't exist in \.{ADOBE} fonts}@/ j_dotless:=@'365;@/ encode[hachek_after]:=hachek_after; end @ Accented characters may overlap some characters already present in the original font. If that happens, characters from the original font take precedence because related pieces of information may already be present in \.{LIGTABLE}. Here we find out whether such conflicts occur, and mask out possible conflicting characters. @= if font_encoding=TeX_encoding then {we only do this check for TeX fonts} for k:=0 to 255 do if link_continues[k] and (not (T_nonexistent(char_links[k]))) then begin not_conflict[char_links[k]]:=false; print('Character '); print_octal(char_links[k]); print_ln(' already present in the parent font;'); print_ln(' --- so I won''t generate it as an accented letter.'); end; @* Store virtual font characters. Once \.{TFM} file has been read in, we are ready to output virtual font definition. @ Font names are system dependent. They can usually be derived from the names of the input \.{TFM} file for the local font, and that of the output \.{VF} file for the virtual font itself. @= fname_start:=vf_ptr; for k:= 1 to T_tfm[T_family] do vf_store(T_tfm[T_family+k]); fname_length:=vf_ptr-fname_start; vtitle_start:=vf_ptr; for k:= 1 to T_tfm[T_family] do vf_store(T_tfm[T_family+k]); vtitle_length:=vf_ptr-vtitle_start; @ Some of the values will be copied verbatim from the information supplied by the \.{TFM} file. In particular, this is true for the basic design parameters of the font. @d design_size==fix_tfm(T_design_size) @ We keep track of whether or not any errors have had to be corrected. @= @!perfect:boolean; {was the file free of errors?} @!x:fix_word; {current dimension of interest} @!k:integer; {general-purpose index} @!default_char:byte; {default character for error correction} @ @= perfect:=true; {innocent until proved guilty} @ Error messages are given with the help of the |bad| and |range_error| and |bad_char| macros: @d bad(#)==begin perfect:=false; message('Bad TFM file: ',#); end @.Bad TFM file@> @d range_error(#)==begin perfect:=false; print(#,' index for character '); print_octal(c); print_ln(' is too large;'); print_ln('so I reset it to zero.'); end @d bad_char_tail(#)==print_octal(#); print_ln('.'); end @d bad_char(#)==begin perfect:=false; if chars_on_line>0 then print_ln(' '); chars_on_line:=0; print('Bad TFM file: ',#,' nonexistent character '); bad_char_tail @d correct_bad_char_tail(#)==print_octal(T_tfm[#]); print_ln('.'); T_tfm[#]:=default_char; end @d correct_bad_char(#)== begin perfect:=false; if chars_on_line>0 then print_ln(' '); chars_on_line:=0; print('Bad TFM file: ',#,' nonexistent character '); correct_bad_char_tail @ @= @! r:0..65535; {a random two-byte value} @! c:0..256; {a random character} @! i:0..@'77777; {an index to words of a subfile} @! d:0..3; {byte number in a word} @! temp_byte, temp_b1, temp_b2: byte; @! temp_fix:fix_word; @ @= seven_bit_safe_flag:=false; @ Copy font parameters (\.{FONTDIMEN}) @= for k:=1 to T_np do param[k]:=fix_tfm(T_param(k)); np:=T_np; @ @= for k:=0 to 7 do header_bytes[k]:=T_tfm[T_check_sum+k]; for k :=0 to 19 do header_bytes[family_loc+k]:=T_tfm[T_family+k]; @ Local font area parameter is reset to empty. @= farea_start:=vf_size; farea_length:=0; @ The ligature/kerning output comes next. The following code is adapted from \.{VFtoVP}; the same checks on ligature/kerning program are performed. Before we can put it out, we need to make a table of ``labels'' that will be inserted into the program. For each character |c| whose |tag| is |lig_tag| and whose starting address is |r|, we will store the pair |(c,r)| in the |label_table| array. If there's a boundary-char program starting at~|r|, we also store the pair |(256,r)|. This array is sorted by its second components, using the simple method of straight insertion. @= @!label_table:array[0..258] of record@t@>@/@!cc:0..256;@!rr:0..T_lig_size;end; @!label_ptr: 0..257; {the largest entry in |label_table|} @!sort_ptr:0..257; {index into |label_table|} @!boundary_char:0..256; {boundary character, or 256 if none} @!T_bchar_label:0..@'77777; {beginning of boundary character program} @ @= boundary_char:=256; T_bchar_label:=@'77777;@/ label_ptr:=0; label_table[0].rr:=0; {a sentinel appears at the bottom} @ We'll also identify and remove inaccessible program steps, using the |activity| array. @d unreachable=0 {a program step not known to be reachable} @d pass_through=1 {a program step passed through on initialization} @d accessible=2 {a program step that can be relevant} @= @!activity:array[0..T_lig_size] of unreachable..accessible; @!ai,@!acti:0..T_lig_size; {indices into |activity|} @ @= if T_nl>0 then begin for ai:=0 to T_nl-1 do activity[ai]:=unreachable; @; end; @; if T_nl>0 then begin @; @; end @ We build the label table even when |nl=0|, because this catches errors that would not otherwise be detected. @= for c:=T_bc to T_ec do if T_tag(c)=lig_tag then begin r:=T_remainder(c); if rstop_flag then begin r:=256*T_tfm[T_lig_step(r)+2]+T_tfm[T_lig_step(r)+3]; if r=T_nl then begin perfect:=false; if chars_on_line>0 then print_ln(' '); chars_on_line:=0; print('Ligature/kern starting index for character '); print_octal(c); print_ln(' is too large;'); print_ln('so I removed it.'); T_reset_tag(c); @.Ligature/kern starting index...@> end else if valid_code[c] then begin temp_byte:=decode[c]; proceed:=true; while proceed do begin @; next_character; end end end; label_table[label_ptr+1].rr:=T_lig_size; {put ``infinite'' sentinel at the end} @ @= begin sort_ptr:=label_ptr; {there's a hole at position |sort_ptr+1|} while label_table[sort_ptr].rr>r do begin label_table[sort_ptr+1]:=label_table[sort_ptr]; decr(sort_ptr); {move the hole} end; label_table[sort_ptr+1].cc:=temp_byte; label_table[sort_ptr+1].rr:=r; {fill the hole} incr(label_ptr); activity[r]:=accessible; end @ @= if T_tfm[T_lig_step(0)]=255 then {\.{BOUNDARYCHAR}} begin boundary_char:=T_tfm[T_lig_step(0)+1]; bchar:=decode[boundary_char]; activity[0]:=pass_through; end; if T_tfm[T_lig_step(T_nl-1)]=255 then begin r:=256*T_tfm[T_lig_step(T_nl-1)+2]+T_tfm[T_lig_step(T_nl-1)+3]; if r>=T_nl then begin perfect:=false; if chars_on_line>0 then print_ln(' '); chars_on_line:=0; print('Ligature/kern starting index for boundarychar is too large;'); print_ln('so I removed it.'); @.Ligature/kern starting index...@> end else begin label_ptr:=1; label_table[1].cc:=256; label_table[1].rr:=r; T_bchar_label:=r; activity[r]:=accessible; end; activity[T_nl-1]:=pass_through; end @ @= for ai:=0 to T_nl-1 do if activity[ai]=accessible then begin r:=T_tfm[T_lig_step(ai)]; if r=T_nl then begin bad('Ligature/kern step ',ai:1,' skips too far;'); @.Lig...skips too far@> print_ln('I made it stop.'); T_tfm[T_lig_step(ai)]:=stop_flag; end else activity[r]:=accessible; end; end @ We ignore |pass_through| items, which would not be mentioned in the \.{VPL} file. @= sort_ptr:=1; {point to the next label that will be needed} @ for acti:=0 to T_nl-1 do if activity[acti]<>pass_through then begin i:=acti; @; @; end; @ @= default_char:=T_bc; while (not valid_code[default_char] or T_nonexistent(default_char)) and (default_char= while i=label_table[sort_ptr].rr do begin {\.{LABEL}} if label_table[sort_ptr].cc=256 then {\.{BOUNDARYCHAR}} bchar_label:=nl else begin c:=label_table[sort_ptr].cc; char_tag[c]:=lig_tag; char_remainder[c]:=nl end; lk_step_ended:=false; incr(sort_ptr); end @ @= @!lk_step_ended: boolean; {was the last \.{LIGTABLE} property \.{LIG} or \.{KRN}?} @!krn_ptr: 0..max_kerns; {an index into |kern|} @!count:integer; {register for simple calculations} @ @= begin k:=T_lig_step(i); if T_tfm[k]>stop_flag then begin if 256*T_tfm[k+2]+T_tfm[k+3]>=T_nl then bad('Ligature unconditional stop command address is too big.'); @.Ligature unconditional stop...@> end else if T_tfm[k+2]>=kern_flag then @ else @; if T_tfm[k]>0 then if lk_step_ended then @; end @ The \.{SKIP} command is a bit tricky, because we will be omitting all inaccessible commands. @= begin if T_tfm[k]>=stop_flag then {\.{STOP}} begin lig_kern[nl-1].b0:=stop_flag; lk_step_ended:=false end else begin count:=0; for ai:=i+1 to i+T_tfm[k] do if activity[ai]=accessible then incr(count); {\.{SKIP} count} if count>=128 then message('Skip amount larger than 127'); if nl+count>=max_lig_steps then message('Sorry, LIGTABLE too long for me to handle') else begin lig_kern[nl-1].b0:=count; if min_nl<=nl+count then min_nl:=nl+count+1; end; {possibly $count=0$, so who cares} lk_step_ended:=false end end @ @= begin lig_kern[nl].b0:=0; lig_kern[nl].b1:=temp_byte; kern[nk]:=temp_fix; krn_ptr:=0; while kern[krn_ptr]<>kern[nk] do incr(krn_ptr); if krn_ptr=nk then begin if nk decr(krn_ptr); end; end; lig_kern[nl].b2:=kern_flag+(krn_ptr div 256); lig_kern[nl].b3:=krn_ptr mod 256; if nl>=max_lig_steps-1 then message('Sorry, LIGTABLE too long for me to handle') @.Sorry, LIGTABLE too long...@> else incr(nl); lk_step_ended:=true; end @ For invalid characters from the parent font with non-\TeX\ font encoding scheme the kern steps are skipped. @= begin if valid_code[T_tfm[k+1]] then@/ begin if T_nonexistent(T_tfm[k+1]) then if decode[T_tfm[k+1]]<>boundary_char then correct_bad_char('Kern step for')(k+1); @.Kern step for nonexistent...@> temp_byte:=decode[T_tfm[k+1]]; r:=256*(T_tfm[k+2]-kern_flag)+T_tfm[k+3]; if r>=T_nk then begin bad('Kern index too large.'); @.Kern index too large@> temp_fix:=0; end else temp_fix:=fix_tfm(T_kern(r)); proceed:=true; while proceed do begin @; next_character; end end else lk_step_ended:=true; end @ @= begin lig_kern[nl].b0:=0; lig_kern[nl].b2:=r; lig_kern[nl].b1:=temp_byte; lig_kern[nl].b3:=temp_b1; if nl>=max_lig_steps-1 then message('Sorry, LIGTABLE too long for me to handle') @.Sorry, LIGTABLE too long...@> else incr(nl); lk_step_ended:=true; end @ Ligatures involving invalid characters from the parent font with non-\TeX\ font encoding scheme are treated as an error. @= begin if valid_code[T_tfm[k+1]] then begin if T_nonexistent(T_tfm[k+1]) or (not valid_code[T_tfm[k+1]]) then if decode[T_tfm[k+1]]<>boundary_char then correct_bad_char('Ligature step for')(k+1); @.Ligature step for nonexistent...@> if T_nonexistent(T_tfm[k+3]) or (not valid_code[T_tfm[k+3]]) then correct_bad_char('Ligature step produces the')(k+3); @.Ligature step produces...@> r:=T_tfm[k+2]; if (r=4)or((r>7)and(r<>11)) then begin message('Ligature step with nonstandard code changed to LIG'); r:=0; T_tfm[k+2]:=0; end; {\.{LIG}} temp_byte:=decode[T_tfm[k+1]]; temp_b1:=decode[T_tfm[k+3]]; @ end else lk_step_ended:=true; end @ The last thing on \.{ACCENTS}'s agenda is to go through the list of |char_info| and spew out the information about each individual character. @= for c:=T_bc to T_ec do if (T_width_index(c)>0) and valid_code[c] then begin temp_byte:=decode[c]; proceed:=true; while proceed do begin if chars_on_line=8 then begin print_ln(' '); chars_on_line:=1; end else begin if chars_on_line>0 then print(' '); incr(chars_on_line); end; print_octal(temp_byte); {progress report} {\.{CHARACTER}} @; if T_height_index(c)>0 then @; if T_depth_index(c)>0 then @; if T_italic_index(c)>0 then @; if temp_byte=decode[c] then @ else @; next_character; end; case T_tag(c) of no_tag: do_nothing; list_tag: @; ext_tag: @; end@/ end @ @= begin temp_fix:=0; @ if T_width_index(c)>=T_nw then range_error('Width') else char_wd[temp_byte]:=sort_in(width,fix_tfm(T_width(c))+temp_fix); end @ Character height of the generated character is extended by the actual height of the accent. @d x_height==fix_tfm(T_param(5)) {|x_height| of our font} @= begin temp_fix:=0; {HEIGHT} @ if T_height_index(c)>=T_nh then range_error('Height') @.Height index for char...@> else char_ht[temp_byte]:=sort_in(height,fix_tfm(T_height(c))+temp_fix) end @ @= begin temp_fix:=0; {HEIGHT} @ if T_depth_index(c)>=T_nd then range_error('Depth') @.Depth index for char@> else char_dp[temp_byte]:=sort_in(depth,fix_tfm(T_depth(c))+temp_fix) end @ @= if T_italic_index(c)>=T_ni then range_error('Italic correction') @.Italic correction index for char...@> else char_ic[temp_byte]:=sort_in(italic,fix_tfm(T_italic(c))) @ We want to make sure that there is no cycle of characters linked together by |list_tag| entries, since such a cycle would get \TeX\ into an endless loop. If such a cycle exists, the routine here detects it when processing the largest character code in the cycle. @= begin r:=T_remainder(c); if T_nonexistent(r) then begin bad_char('Character list link to')(r); T_reset_tag(c); @.Character list link...@> end else begin while (decode[r] print('Character '); print_octal(c); print_ln(' now ends the list.'); T_reset_tag(c); end else begin char_tag[decode[c]]:=list_tag; char_remainder[decode[c]]:=decode[T_remainder(c)]; end; end; end @ @= if T_remainder(c)>=T_ne then begin range_error('Extensible'); T_reset_tag(c); @.Extensible index for char@> end else begin @ end @ @= for k:=0 to 3 do if (k=3)or(T_tfm[T_exten(c)+k]>0) then begin if T_nonexistent(T_tfm[T_exten(c)+k]) or (not valid_code[T_tfm[T_exten(c)+k]]) then temp_byte:=decode[c] else temp_byte:=decode[T_tfm[T_exten(c)+k]]; case k of 0:exten[ne].b0:=temp_byte;@+1:exten[ne].b1:=temp_byte;@+ 2:exten[ne].b2:=temp_byte;@+3:exten[ne].b3:=temp_byte@+end; end @ Some of the extensible recipes may not actually be used, but \TeX\ will complain about them anyway if they refer to nonexistent characters. Therefore \.{ACCENTS} must check them too. @= if T_ne>0 then for c:=0 to T_ne-1 do for d:=0 to 3 do begin k:=4*(T_exten_base+c)+d; if (T_tfm[k]>0)or(d=3) then begin if T_nonexistent(T_tfm[k]) or (not valid_code[T_tfm[k]]) then begin bad_char('Extensible recipe involves the')(T_tfm[k]); @.Extensible recipe involves...@> if d<3 then T_tfm[k]:=0; end; end; end @ Just before each \.{CHARACTER} property list is generated, the character code is printed in octal notation. Up to eight such codes appear on a line; so we have a variable to keep track of how many are currently there. @= @!chars_on_line:0..8; {the number of characters printed on the current line} @ @= chars_on_line:=0; @ @p procedure print_octal(c:byte); {prints three octal digits} begin print('''',(c div 64):1,((c div 8) mod 8):1,(c mod 8):1); end; @* Generate the mappings. Each \.{MAP} property is a sequence of \.{DVI} instructions, for which we need to know some of the opcodes. @d set_char_0=0 {\.{DVI} command to typeset character 0 and move right} @d set1=128 {typeset a character and move right} @d set_rule=132 {typeset a rule and move right} @d push=141 {save the current positions} @d pop=142 {restore previous positions} @d right1=143 {move right} @d w0=147 {move right by |w|} @d w1=148 {move right and set |w|} @d x0=152 {move right by |x|} @d x1=153 {move right and set |x|} @d down1=157 {move down} @d y0=161 {move down by |y|} @d y1=162 {move down and set |y|} @d z0=166 {move down by |z|} @d z1=167 {move down and set |z|} @d fnt_num_0=171 {set current font to 0} @d fnt1=235 {set current font} @d xxx1=239 {extension to \.{DVI} primitives} @d xxx4=242 {potentially long extension to \.{DVI} primitives} @d fnt_def1=243 {define the meaning of a font number} @d pre=247 {preamble} @d post=248 {postamble beginning} @ All the parent file characters have to be set without any change. @= begin packet_start[decode[c]]:=vf_ptr; if c>=128 then vf_store(set1);@+ vf_store(c); packet_length[decode[c]]:=vf_ptr-packet_start[decode[c]]; end @ Accented versions of the source characters are being set as simple \.{DVI} programs. @= begin packet_start[temp_byte]:=vf_ptr; vf_store(push); case char_modifications[temp_byte] of goes_above: make_accent(encode[char_modifications[temp_byte]],c,temp_byte); hachek_after: make_hachek_after(c,temp_byte); cedilla_accent, dot_under: make_under(char_modifications[temp_byte], encode[char_modifications[temp_byte]],c,temp_byte); ogonek: if font_encoding=ADOBE_encoding then make_under(char_modifications[temp_byte], ADOBE_ogonek,c,temp_byte) else make_under(char_modifications[temp_byte], encode[char_modifications[temp_byte]],c,temp_byte); bar_under: make_under(char_modifications[temp_byte], encode[macron],c,temp_byte); suppress: make_suppressed(c,temp_byte); othercases {set |c| without any change} begin if c>=128 then vf_store(set1); @+ vf_store(c); end; endcases; vf_store(pop); packet_length[temp_byte]:=vf_ptr-packet_start[temp_byte]; end @ The positioning of accents is straightforward but tedious, and was taken from sections 1123--1125 from \TeX82 program. Given an accent of width |a|, designed for characters of height |x| and slant |s|; and given a character of width |w|, height |h|, and slant |t|: We will shift the accent down by $\gamma=x-h$, and we will insert kern nodes that have the effect of centering the accent over the character and shifting the accent to the right by $\delta={1\over2}(w-a)+h\cdot t-x\cdot s$. If either character is absent from the font, we will simply use the other, without shifting. @d slant==fix_tfm(T_param(1)) {|slant| of our font} @d float_constant(#)==#.0 {convert |integer| constant to |real|} @p procedure make_accent(@!accent,@!character,@!ch: byte); var s,@!t: real; {amount of slant} @!a,@!h,@!x,@!w,@!delta,@!gamma:fix_word; {heights and widths, as explained above} @!c:byte; {dotless character} begin c:=character; if c=i_normal then c:=i_dotless else if c=j_normal then c:=j_dotless;@/ x:=x_height; s:=slant/float_constant(1048576); @^real division@> a:=fix_tfm(T_width(accent));@/ @; end; @ The two kerns are computed with (machine-dependent) |real| arithmetic, but their sum is machine-independent; the net effect is machine-independent. @= begin t:=slant/float_constant(1048576); @^real division@> w:=fix_tfm(T_width(c)); h:=fix_tfm(T_height(c)); gamma:=x-h; {amount by which the accent must be shifted up or down} delta:=round((w-a)/float_constant(2)+h*t-x*s);@/ gamma:=gamma+vadjust[ch]; delta:=delta+hadjust[ch]; @^real multiplication@> @^real addition@> vf_store(push);@/ if delta<>0 then vf_fix(right1,delta);@/ if gamma<>0 then vf_fix(down1,gamma);@/ if accent>=128 then vf_store(set1);@+ vf_store(accent);@/vf_store(pop);@/ if c>=128 then vf_store(set1);@+ vf_store(c);@/ end @ ``Hachek after'' is an ordinary apostrophe placed immediately after the accented character. Before apostrophe placing we make |0.1*design_size| kern back (this is just an ad-hock distance to make those two closer together), and the apostrophe itself doesn't have any width. This doesn't place the apostrophe correctly after capital~|"L"|, but we leave this on the user. We first place the apostrophe and then backup in order to prevent \.{DVICOPY} to insert negative horizontal move after virtual character processing. @p procedure make_hachek_after(@!c,@!ch: byte); var @!h,@!x,@!delta,@!gamma:fix_word; {heights and widths} @! a:byte; begin a:=encode[@'047]; h:=fix_tfm(T_height(a)); {height of apostrophe}@/ x:=fix_tfm(T_height(c)); gamma:=h-x; {amount by which the accent must be shifted up or down} delta:=-(unity div 10);@/ gamma:=gamma+vadjust[ch]; delta:=delta+hadjust[ch]; delta:=delta+fix_tfm(T_width(c)); vf_store(push);@/ if delta<>0 then vf_fix(right1,delta);@/ if gamma<>0 then vf_fix(down1,gamma);@/ if a>=128 then vf_store(set1); vf_store(a);@/ vf_store(pop);@/ if c>=128 then vf_store(set1); vf_store(c);@/ end; @ Cedilla, ``dot under'', ``bar under'', and ``ogonek'' are placed below the core characters and either centered or right-aligned. Given an accent of width |a| and height |h|; and a character of width |w|, and depth |d|: We will shift the accent down by $\gamma=d+h$ (plus $0.2\times x_height$, if it is the bar under accent), and we will insert kern nodes that have the effect of either centering the accent under the character and shifting the accent to the right by $\delta={1\over2}(w-a)$, or $\delta=w-a$. If either character is absent from the font, we will simply use the other, without shifting. @p procedure make_under(@!src_accent,@!accent,@!c,@!ch: byte); var @!a,@!h,@!d,@!x,@!w,@!delta,@!gamma:fix_word; {heights and widths, as explained above} begin x:=x_height;@/ a:=fix_tfm(T_width(accent)); h:=fix_tfm(T_height(accent));@/ w:=fix_tfm(T_width(c)); d:=fix_tfm(T_depth(c));@/ if w=0 then w:=a; {default width if the character is missing} gamma:=d+h; {amount by which the accent must be shifted down}@/ case src_accent of {special positioning of accents going below a character} cedilla_accent: delta:=(w-a) div 2; {center} dot_under, bar_under: begin delta:=(w-a) div 2; {center} gamma:=gamma+2*(x_height div 5); {additional shift down} end; ogonek: delta:=w-a; {right alignment} end; {no other cases} gamma:=gamma+vadjust[ch]; delta:=delta+hadjust[ch]; vf_store(push);@/ if delta<>0 then vf_fix(right1,delta);@/ if gamma<>0 then vf_fix(down1,gamma);@/ if accent>=128 then vf_store(set1);@+ vf_store(accent);@/vf_store(pop);@/ if c>=128 then vf_store(set1);@+ vf_store(c);@/ end; @ Polish suppressed l and L are created as mere concetenations of the suppression character and the base character l, or L. Overlapping of these two characters is ensured via kerning from \.{.TFM}, so we have to go through the corresponding kerning program. @p procedure make_suppressed(@!c,@!ch: byte); label exit; var @!w,@!k,@!delta,@!gamma:fix_word; {width, kern, and moves} @!a:byte; @!i,@!j,@!r:integer; {indexes into |T_tfm|} begin a:=encode[suppress]; w:=fix_tfm(T_width(a)); {width of the suppressing bar}@/ gamma:=0; {amount by which the accent must be shifted up or down} delta:=-(unity div 10);@/ @ gamma:=gamma+vadjust[ch]; delta:=delta+hadjust[ch]; if font_encoding=ADOBE_encoding then case decode[c] of "l": begin vf_store(set1); vf_store(@'370) end; {l suppressed} "L": begin vf_store(set1); vf_store(@'350) end; {L suppressed} othercases {there ain't such an accent} begin if c>=128 then vf_store(set1); vf_store(c) end; endcases else begin vf_store(push);@/ if delta<>0 then vf_fix(right1,delta);@/ if gamma<>0 then vf_fix(down1,gamma);@/ if a>=128 then vf_store(set1); vf_store(a);@/ vf_store(pop);@/ if c>=128 then vf_store(set1); vf_store(c);@/ end; end; @ @= if T_tag(a)=lig_tag then begin i:=T_remainder(a); r:=T_lig_step(i); if T_tfm[r]>stop_flag then i:=256*T_tfm[r+2]+T_tfm[r+3]; repeat j:=T_lig_step(i); if T_tfm[j]>stop_flag then begin if 256*T_tfm[j+2]+T_tfm[j+3]>=T_nl then bad('Ligature unconditional stop command address is too big.'); end else if T_tfm[j+2]>=kern_flag then {this is a kern step} begin if T_nonexistent(T_tfm[j+1]) then if T_tfm[j+1]<>boundary_char then correct_bad_char('Kern step for')(j+1); if T_tfm[j+1]=c then begin i:=256*(T_tfm[j+2]-kern_flag)+T_tfm[j+3]; if i>=T_nk then bad('Kern index too large.') else delta:=(-fix_tfm(T_kern(i)))-w; {|-k-w|} goto exit; end end; if T_tfm[j]>0 then goto exit; if T_tfm[j]>stop_flag then i:=T_nl else i:=i+1+T_tfm[j]; until i>=T_nl; end; exit: @ Accents sometimes change the resulting character dimensions. The rest of this chapter determines the dimension corrections when needed. @= temp_b1:=encode[char_modifications[temp_byte]]; if (temp_byte<>c) and (T_width_index(temp_b1)>0) then case char_modifications[temp_byte] of goes_above: begin temp_fix:=fix_tfm(T_height(temp_b1))-x_height; temp_fix:=temp_fix-vadjust[char_modifications[temp_byte]]; end; othercases temp_fix:=0; endcases; if temp_fix<0 then temp_fix:=0; @ @= temp_b1:=encode[char_modifications[temp_byte]]; if (temp_byte<>c) and (T_width_index(temp_b1)>0) then case char_modifications[temp_byte] of cedilla_accent, ogonek: begin temp_fix:=fix_tfm(T_depth(temp_b1)); temp_fix:=temp_fix+vadjust[char_modifications[temp_byte]]; end; dot_under, bar_under: begin temp_fix:=x_height div 4; temp_fix:=temp_fix+vadjust[char_modifications[temp_byte]]; end; othercases temp_fix:=0; endcases; if temp_fix<0 then temp_fix:=0; @ The width of the accented character should be the same as that of the core character. When using modern device drivers, there is absolutely no problem with this, and this section should be made void. Nonetheless, we would like to be able to process the \.{.dvi} file by the \.{DVICOPY} program, and this is the point when possible problems arise: \.{DVICOPY} may set the invisible rule in order to compensate for the difference between the resulting reference point after typesetting the body of \.{.VF} character and the assuming resulting right-end reference point as derived from the character width from \.{.TFM}. Hence, when the base character actually typeset by \.{ACCENTS} differs from the core character the accented one is associted with, we have to set the record straight. @= temp_b1:=c; if temp_byte<>c then begin if temp_b1=i_normal then temp_b1:=i_dotless else if temp_b1=j_normal then temp_b1:=j_dotless else if font_encoding=ADOBE_encoding then begin if char_modifications[temp_byte]=suppress then case temp_b1 of "l": temp_b1:=@'370; {l suppressed} "L": temp_b1:=@'350; {L suppressed} endcases; end; if T_width_index(temp_b1)>0 then temp_fix:=fix_tfm(T_width(temp_b1))-fix_tfm(T_width(c)); end; if temp_fix<0 then temp_fix:=0; {O.K. -- we give up!!} @* The checking and massaging phase. Once the whole \.{VPL} file has been read in, we must check it for consistency and correct any errors. This process consists mainly of running through the characters that exist and seeing if they refer to characters that don't exist. We also compute the true value of |seven_unsafe|; we make sure that the charlists and ligature programs contain no loops; and we shorten the lists of widths, heights, depths, and italic corrections, if necessary, to keep from exceeding the required maximum sizes. @= @!seven_unsafe:boolean; {do seven-bit characters generate eight-bit ones?} @ @= if nl>0 then @; seven_unsafe:=false; for c:=0 to 255 do if char_wd[c]<>0 then @; if bchar_label<@'77777 then begin c:=256; @; end; if seven_bit_safe_flag and seven_unsafe then message('The font is not really seven-bit-safe!'); @.The font is not...safe@> @; @; for c:=0 to 255 do @; @ @ The checking that we need in several places is accomplished by three macros that are only slightly tricky. @d existence_tail(#)==begin char_wd[g]:=sort_in(width,0); if chars_on_line>0 then print_ln(' '); chars_on_line:=0; print(#,' '); print_octal(c); print_ln(' had no CHARACTER spec.'); end; end @d check_existence_and_safety(#)==begin g:=#; if (g>=128)and(c<128) then seven_unsafe:=true; if char_wd[g]=0 then existence_tail @d check_existence(#)==begin g:=#; if char_wd[g]=0 then existence_tail @= case char_tag[c] of no_tag: do_nothing; lig_tag: @; list_tag: check_existence_and_safety(char_remainder[c]) ('The character NEXTLARGER than'); @.The character NEXTLARGER...@> ext_tag:@; end @ @= begin if exten[char_remainder[c]].b0>0 then check_existence_and_safety(exten[char_remainder[c]].b0) ('TOP piece of character'); @.TOP piece of character...@> if exten[char_remainder[c]].b1>0 then check_existence_and_safety(exten[char_remainder[c]].b1) ('MID piece of character'); @.MID piece of character...@> if exten[char_remainder[c]].b2>0 then check_existence_and_safety(exten[char_remainder[c]].b2) ('BOT piece of character'); @.BOT piece of character...@> check_existence_and_safety(exten[char_remainder[c]].b3) ('REP piece of character'); @.REP piece of character...@> end @ @= if char_tag[c]=list_tag then begin g:=char_remainder[c]; while (g0 then print_ln(' '); chars_on_line:=0; print('A cycle of NEXTLARGER characters has been broken at '); @.A cycle of NEXTLARGER...@> print_octal(c); print_ln('.'); end; end @ @= @!delta:fix_word; {size of the intervals needed for rounding} @ @d round_message(#)==if delta>0 then message('I had to round some ', @.I had to round...@> #,'s by ',(((delta+1) div 2)/@'4000000):1:7,' units.') @= delta:=shorten(width,255); set_indices(width,delta); round_message('width');@/ delta:=shorten(height,15); set_indices(height,delta); round_message('height');@/ delta:=shorten(depth,15); set_indices(depth,delta); round_message('depth');@/ delta:=shorten(italic,63); set_indices(italic,delta); round_message('italic correction'); @ @d clear_lig_kern_entry== {make an unconditional \.{STOP}} lig_kern[nl].b0:=255; lig_kern[nl].b1:=0; lig_kern[nl].b2:=0; lig_kern[nl].b3:=0 @= begin if bchar_label<@'77777 then {make room for it} begin clear_lig_kern_entry; incr(nl); end; {|bchar_label| will be stored later} while min_nl>nl do begin clear_lig_kern_entry; incr(nl); end; if lig_kern[nl-1].b0=0 then lig_kern[nl-1].b0:=stop_flag; end @ It's not trivial to check for infinite loops generated by repeated insertion of ligature characters. But fortunately there is a nice algorithm for such testing, copied here from the program \.{TFtoPL} where it is explained further. @d simple=0 {$f(x,y)=z$} @d left_z=1 {$f(x,y)=f(z,y)$} @d right_z=2 {$f(x,y)=f(x,z)$} @d both_z=3 {$f(x,y)=f(f(x,z),y)$} @d pending=4 {$f(x,y)$ is being evaluated} @ @= @!lig_ptr:0..max_lig_steps; {an index into |lig_kern|} @!hash:array[0..hash_size] of 0..66048; {$256x+y+1$ for $x\le257$ and $y\le255$} @!class:array[0..hash_size] of simple..pending; @!lig_z:array[0..hash_size] of 0..257; @!hash_ptr:0..hash_size; {the number of nonzero entries in |hash|} @!hash_list:array[0..hash_size] of 0..hash_size; {list of those nonzero entries} @!h,@!hh:0..hash_size; {indices into the hash table} @!tt:indx; {temporary register} @!x_lig_cycle,@!y_lig_cycle:0..256; {problematic ligature pair} @ @= hash_ptr:=0; y_lig_cycle:=256; for k:=0 to hash_size do hash[k]:=0; @ @d lig_exam==lig_kern[lig_ptr].b1 @d lig_gen==lig_kern[lig_ptr].b3 @= begin lig_ptr:=char_remainder[c]; repeat if hash_input(lig_ptr,c) then begin if lig_kern[lig_ptr].b2bchar then check_existence(lig_exam)('LIG character examined by'); @.LIG character examined...@> check_existence(lig_gen)('LIG character generated by'); @.LIG character generated...@> if lig_gen>=128 then if(c<128)or(c=256) then if(lig_exam<128)or(lig_exam=bchar) then seven_unsafe:=true; end else if lig_exam<>bchar then check_existence(lig_exam)('KRN character examined by'); @.KRN character examined...@> end; if lig_kern[lig_ptr].b0>=stop_flag then lig_ptr:=nl else lig_ptr:=lig_ptr+1+lig_kern[lig_ptr].b0; until lig_ptr>=nl; end @ The |hash_input| procedure is copied from \.{TFtoPL}, but it is made into a boolean function that returns |false| if the ligature command was masked by a previous one. @p function hash_input(@!p,@!c:indx):boolean; {enter data for character |c| and command in location |p|, unless it isn't new} label 30; {go here for a quick exit} var @!cc:simple..both_z; {class of data being entered} @!zz:0..255; {function value or ligature character being entered} @!y:0..255; {the character after the cursor} @!key:integer; {value to be stored in |hash|} @!t:integer; {temporary register for swapping} begin if hash_ptr=hash_size then begin hash_input:=false; goto 30;@+end; @; key:=256*c+y+1; h:=(1009*key) mod hash_size; while hash[h]>0 do begin if hash[h]<=key then begin if hash[h]=key then begin hash_input:=false; goto 30; {unused ligature command} end; t:=hash[h]; hash[h]:=key; key:=t; {do ordered-hash-table insertion} t:=class[h]; class[h]:=cc; cc:=t; {namely, do a swap} t:=lig_z[h]; lig_z[h]:=zz; zz:=t; end; if h>0 then decr(h)@+else h:=hash_size; end; hash[h]:=key; class[h]:=cc; lig_z[h]:=zz; incr(hash_ptr); hash_list[hash_ptr]:=h; hash_input:=true; 30:end; @ @= y:=lig_kern[p].b1; t:=lig_kern[p].b2; cc:=simple; zz:=lig_kern[p].b3; if t>=kern_flag then zz:=y else begin case t of 0,6:do_nothing; {\.{LIG},\.{/LIG>}} 5,11:zz:=y; {\.{LIG/>}, \.{/LIG/>>}} 1,7:cc:=left_z; {\.{LIG/}, \.{/LIG/>}} 2:cc:=right_z; {\.{/LIG}} 3:cc:=both_z; {\.{/LIG/}} end; {there are no other cases} end @ (More good stuff from \.{TFtoPL}.) @p function f(@!h,@!x,@!y:indx):indx; forward;@t\2@> {compute $f$ for arguments known to be in |hash[h]|} function eval(@!x,@!y:indx):indx; {compute $f(x,y)$ with hashtable lookup} var @!key:integer; {value sought in hash table} begin key:=256*x+y+1; h:=(1009*key) mod hash_size; while hash[h]>key do if h>0 then decr(h)@+else h:=hash_size; if hash[h]= if hash_ptrsimple then {make sure $f$ is well defined} tt:=f(tt,(hash[tt]-1)div 256,(hash[tt]-1)mod 256); end; if(hash_ptr=hash_size)or(y_lig_cycle<256) then begin if hash_ptr0 then print_ln(' '); chars_on_line:=0; print('Infinite ligature loop starting with '); @.Infinite ligature loop...@> if x_lig_cycle=256 then print('boundary')@+else print_octal(x_lig_cycle); print(' and '); print_octal(y_lig_cycle); print_ln('!'); end else message('Sorry, I haven''t room for so many ligature/kern pairs!'); @.Sorry, I haven't room...@> print_ln('All ligatures will be cleared.'); for c:=0 to 255 do if char_tag[c]=lig_tag then begin char_tag[c]:=no_tag; char_remainder[c]:=0; end; nl:=0; bchar:=256; bchar_label:=@'77777; end @ The lig/kern program may still contain references to nonexistent characters, if parts of that program are never used. Similarly, there may be extensible characters that are never used, because they were overridden by \.{NEXTLARGER}, say. This would produce an invalid \.{TFM} file; so we must fix such errors. @d double_check_tail(#)==@t\1@>if char_wd[0]=0 then char_wd[0]:=sort_in(width,0); if chars_on_line>0 then print_ln(' '); chars_on_line:=0; print('Unused ',#,' refers to nonexistent character '); print_octal(c); print_ln('!'); end; end @d double_check_lig(#)==begin c:=lig_kern[lig_ptr].#; if char_wd[c]=0 then if c<>bchar then begin lig_kern[lig_ptr].#:=0; double_check_tail @d double_check_ext(#)==begin c:=exten[g].#; if c>0 then if char_wd[c]=0 then begin exten[g].#:=0; double_check_tail @d double_check_rep(#)==begin c:=exten[g].#; if char_wd[c]=0 then begin exten[g].#:=0; double_check_tail @= if nl>0 then for lig_ptr:=0 to nl-1 do if lig_kern[lig_ptr].b2 @.Unused KRN step...@> if ne>0 then for g:=0 to ne-1 do begin double_check_ext(b0)('VARCHAR TOP'); double_check_ext(b1)('VARCHAR MID'); double_check_ext(b2)('VARCHAR BOT'); double_check_rep(b3)('VARCHAR REP'); @.Unused VARCHAR...@> end @* The TFM output phase. Now that we know how to get all of the font data correctly stored in \.{ACCENTS}'s memory, it only remains to write the answers out. First of all, it is convenient to have an abbreviation for output to the \.{TFM} file: @d out(#)==write(tfm_file,#) @ The general plan for producing \.{TFM} files is long but simple: @= @; @; @; @; @; @; @; @ @ A \.{TFM} file begins with 12 numbers that tell how big its subfiles are. We already know most of these numbers; for example, the number of distinct widths is |memory[width]+1|, where the $+1$ accounts for the zero width that is always supposed to be present. But we still should compute the beginning and ending character codes (|bc| and |ec|), the number of header words (|lh|), and the total number of words in the \.{TFM} file (|lf|). @= @!bc:byte; {the smallest character code in the font} @!ec:byte; {the largest character code in the font} @!lh:byte; {the number of words in the header block} @!lf:0..32767; {the number of words in the entire \.{TFM} file} @!not_found:boolean; {has a font character been found?} @!temp_width:fix_word; {width being used to compute a check sum} @ It might turn out that no characters exist at all. But \.{ACCENTS} keeps going and writes the \.{TFM} anyway. In this case |ec| will be~0 and |bc| will be~1. @= lh:=header_ptr div 4;@/ not_found:=true; bc:=0; while not_found do if (char_wd[bc]>0)or(bc=255) then not_found:=false else incr(bc); not_found:=true; ec:=255; while not_found do if (char_wd[ec]>0)or(ec=0) then not_found:=false else decr(ec); if bc>ec then bc:=1; incr(memory[width]); incr(memory[height]); incr(memory[depth]); incr(memory[italic]);@/ @; lf:=6+lh+(ec-bc+1)+memory[width]+memory[height]+memory[depth]+ memory[italic]+nl+lk_offset+nk+ne+np; @ @d out_size(#)==out((#) div 256); out((#) mod 256) @= out_size(lf); out_size(lh); out_size(bc); out_size(ec); out_size(memory[width]); out_size(memory[height]); out_size(memory[depth]); out_size(memory[italic]); out_size(nl+lk_offset); out_size(nk); out_size(ne); out_size(np); @ The routines that follow need a few temporary variables of different types. @= @!j:0..max_header_bytes; {index into |header_bytes|} @!p:pointer; {index into |memory|} @!q:width..italic; {runs through the list heads for dimensions} @!par_ptr:0..max_param_words; {runs through the parameters} @!chk_sum0,@!chk_sum1,@!chk_sum2,@!chk_sum3: byte; {four bytes for the check sum computation} @ The header block follows the subfile sizes. The necessary information all appears in |header_bytes|, except that the design size and the seven-bit-safe flag must still be set. @= if font_encoding=ADOBE_encoding then begin header_bytes[8]:=27; header_bytes[25]:=" "; header_bytes[26]:="f"; header_bytes[27]:="r"; header_bytes[28]:="o"; header_bytes[29]:="m"; header_bytes[30]:=" "; header_bytes[31]:="A"; header_bytes[32]:="D"; header_bytes[33]:="O"; header_bytes[34]:="B"; header_bytes[35]:="E"; end; if not check_sum_specified then @; header_bytes[design_size_loc]:=design_size div @'100000000; {this works since |design_size>0|} header_bytes[design_size_loc+1]:=(design_size div @'200000) mod 256; header_bytes[design_size_loc+2]:=(design_size div 256) mod 256; header_bytes[design_size_loc+3]:=design_size mod 256; if not seven_unsafe then header_bytes[seven_flag_loc]:=128; for j:=0 to header_ptr-1 do out(header_bytes[j]); @ @= begin chk_sum0:=bc; chk_sum1:=ec; chk_sum2:=bc; chk_sum3:=ec; for c:=bc to ec do if char_wd[c]>0 then begin temp_width:=memory[char_wd[c]]; if design_units<>unity then temp_width:=round((temp_width/design_units)*1048576.0); temp_width:=temp_width + (c+4)*@'20000000; {this should be positive} chk_sum0:=(chk_sum0+chk_sum0+temp_width) mod 255; chk_sum1:=(chk_sum1+chk_sum1+temp_width) mod 253; chk_sum2:=(chk_sum2+chk_sum2+temp_width) mod 251; chk_sum3:=(chk_sum3+chk_sum3+temp_width) mod 247; end; header_bytes[check_sum_loc]:=chk_sum0; header_bytes[check_sum_loc+1]:=chk_sum1; header_bytes[check_sum_loc+2]:=chk_sum2; header_bytes[check_sum_loc+3]:=chk_sum3; end @ The next block contains packed |char_info|. @= index[0]:=0; for c:=bc to ec do begin out(index[char_wd[c]]); out(index[char_ht[c]]*16+index[char_dp[c]]); out(index[char_ic[c]]*4+char_tag[c]); out(char_remainder[c]); end @ When a scaled quantity is output, we may need to divide it by |design_units|. The following subroutine takes care of this, using floating point arithmetic only if |design_units<>1.0|. @p procedure out_scaled(x:fix_word); {outputs a scaled |fix_word|} var @!n:byte; {the first byte after the sign} @!m:0..65535; {the two least significant bytes} begin if abs(x/design_units)>=16.0 then begin message('The relative dimension ',x/@'4000000:1:3, ' is too large.'); @.The relative dimension...@> print(' (Must be less than 16*designsize'); if design_units<>unity then print(' =',design_units/@'200000:1:3, ' designunits'); print_ln(')'); x:=0; end; if design_units<>unity then x:=round((x/design_units)*1048576.0); if x<0 then begin out(255); x:=x+@'100000000; end else out(0); n:=x div @'200000; m:=x mod @'200000; out(n); out(m div 256); out(m mod 256); end; @ We have output the packed indices for individual characters. The scaled widths, heights, depths, and italic corrections are next. @= for q:=width to italic do begin out(0); out(0); out(0); out(0); {output the zero word} p:=link[q]; {head of list} while p>0 do begin out_scaled(memory[p]); p:=link[p]; end; end; @ One embarrassing problem remains: The ligature/kern program might be very long, but the starting addresses in |char_remainder| can be at most~255. Therefore we need to output some indirect address information; we want to compute |lk_offset| so that addition of |lk_offset| to all remainders makes all but |lk_offset| distinct remainders less than~256. For this we need a sorted table of all relevant remainders. @= @!v_label_table:array[0..256] of record @!rr: -1..@'77777; {sorted label values} @!cc: byte; {associated characters} end; @!v_label_ptr:0..256; {index of highest entry in |v_label_table|} @!v_sort_ptr:0..256; {index into |v_label_table|} @!lk_offset:0..256; {smallest offset value that might work} @!t:0..@'77777; {label value that is being redirected} @!extra_loc_needed:boolean; {do we need a special word for |bchar|?} @ @= @; if bchar<256 then begin extra_loc_needed:=true; lk_offset:=1; end else begin extra_loc_needed:=false; lk_offset:=0; end; @; if bchar_label<@'77777 then begin lig_kern[nl-1].b2:=(bchar_label+lk_offset)div 256; lig_kern[nl-1].b3:=(bchar_label+lk_offset)mod 256; end @ @= v_label_ptr:=0; v_label_table[0].rr:=-1; {sentinel} for c:=bc to ec do if char_tag[c]=lig_tag then begin v_sort_ptr:=v_label_ptr; {there's a hole at position |v_sort_ptr+1|} while v_label_table[v_sort_ptr].rr>char_remainder[c] do begin v_label_table[v_sort_ptr+1]:=v_label_table[v_sort_ptr]; decr(v_sort_ptr); {move the hole} end; v_label_table[v_sort_ptr+1].cc:=c; v_label_table[v_sort_ptr+1].rr:=char_remainder[c]; incr(v_label_ptr); end @ @= begin v_sort_ptr:=v_label_ptr; {the largest unallocated label} if v_label_table[v_sort_ptr].rr+lk_offset > 255 then begin lk_offset:=0; extra_loc_needed:=false; {location 0 can do double duty} repeat char_remainder[v_label_table[v_sort_ptr].cc]:=lk_offset; while v_label_table[v_sort_ptr-1].rr=v_label_table[v_sort_ptr].rr do begin decr(v_sort_ptr); char_remainder[v_label_table[v_sort_ptr].cc]:=lk_offset; end; incr(lk_offset); decr(v_sort_ptr); until lk_offset+v_label_table[v_sort_ptr].rr<256; {N.B.: |lk_offset=256| satisfies this when |v_sort_ptr=0|} end; if lk_offset>0 then while v_sort_ptr>0 do begin char_remainder[v_label_table[v_sort_ptr].cc]:= char_remainder[v_label_table[v_sort_ptr].cc]+lk_offset; decr(v_sort_ptr); end; end @ @= if extra_loc_needed then {|lk_offset=1|} begin out(255); out(bchar); out(0); out(0); end else for v_sort_ptr:=1 to lk_offset do {output the redirection specs} begin t:=v_label_table[v_label_ptr].rr; if bchar<256 then begin out(255); out(bchar); end else begin out(254); out(0); end; out_size(t+lk_offset); repeat decr(v_label_ptr); until v_label_table[v_label_ptr].rr0 then for lig_ptr:=0 to nl-1 do begin out(lig_kern[lig_ptr].b0); out(lig_kern[lig_ptr].b1); out(lig_kern[lig_ptr].b2); out(lig_kern[lig_ptr].b3); end; if nk>0 then for krn_ptr:=0 to nk-1 do out_scaled(kern[krn_ptr]) @ @= if ne>0 then for c:=0 to ne-1 do begin out(exten[c].b0); out(exten[c].b1); out(exten[c].b2); out(exten[c].b3); end; @ For our grand finale, we wind everything up by outputting the parameters. @= for par_ptr:=1 to np do begin if par_ptr=1 then @ else out_scaled(param[par_ptr]); end @ @= begin if param[1]<0 then begin param[1]:=param[1]+@'10000000000; out((param[1] div @'100000000)+256-64); end else out(param[1] div @'100000000); out((param[1] div @'200000) mod 256); out((param[1] div 256) mod 256); out(param[1] mod 256); end @* The VF output phase. Output to |vf_file| is considerably simpler. @d id_byte=202 {current version of \.{VF} format} @d vout(#)==write(vf_file,#) @= @!vcount:integer; {number of bytes written to |vf_file|} @ We need a routine to output integers as four bytes. Negative values will never be less than $-2^{24}$. @p procedure vout_int(@!x:integer); begin if x>=0 then vout(x div @'100000000) else begin vout(255); x:=x+@'100000000; end; vout((x div @'200000) mod 256); vout((x div @'400) mod 256); vout(x mod 256); end; @ @= vout(pre); vout(id_byte); vout(vtitle_length); for k:=0 to vtitle_length-1 do vout(vf[vtitle_start+k]); for k:=check_sum_loc to design_size_loc+3 do vout(header_bytes[k]); vcount:=vtitle_length+11; @; for c:=bc to ec do if char_wd[c]>0 then @; repeat vout(post); incr(vcount); until vcount mod 4 = 0 @ @= begin vout(fnt_def1); vout(0); {font number=0}@/ vout_int(font_checksum); vout_int(font_at); vout_int(font_dsize); vout(farea_length); vout(fname_length); for k:=0 to farea_length-1 do vout(vf[farea_start+k]); for k:=0 to fname_length-1 do vout(vf[fname_start+k]); vcount:=vcount+12+farea_length+fname_length; end @ @= begin x:=memory[char_wd[c]]; if design_units<>unity then x:=round((x/design_units)*1048576.0); if (packet_length[c]>241)or(x<0)or(x>=@'100000000) then begin vout(242); vout_int(packet_length[c]); vout_int(c); vout_int(x); vcount:=vcount+13+packet_length[c]; end else begin vout(packet_length[c]); vout(c); vout(x div @'200000); vout((x div @'400) mod 256); vout(x mod 256); vcount:=vcount+5+packet_length[c]; end; if packet_start[c]=vf_size then begin if c>=128 then vout(set1); vout(c); end else for k:=0 to packet_length[c]-1 do vout(vf[packet_start[c]+k]); end @* Auxiliary input routines. @= @!adj_file:text; @!adj_present:boolean; @ On some systems you may can do something special to test a presence of a file to be opened. The following code should test whether the adjustment value file does not exist -- in such a case the |adj_present| variable will be set to |false|, and the auxiliary input will be ignored. @^system dependencies@> @= reset(adj_file); adj_present:=true; {set to false if default adjustment file not found} @ For the purposes of this program, a |byte| is an unsigned eight-bit quantity, and an |ASCII_code| is an integer between @'40 and @'177. Such ASCII codes correspond to one-character constants like \.{"A"} in \.{WEB} language. @= @!ASCII_code=@'40..@'177; {standard ASCII code numbers} @ One of the things \.{ACCENTS} has to do is convert characters of strings to ASCII form, since that is the code used for the family name and the coding scheme in a \.{TFM} file. An array |xord| is used to do the conversion from |char|; the method below should work with little or no change on most \PASCAL\ systems. @^system dependencies@> @d first_ord=0 {ordinal number of the smallest element of |char|} @d last_ord=127 {ordinal number of the largest element of |char|} @= @!xord:array[char] of ASCII_code; {conversion table} @ Characters that should not appear in \.{VPL} files (except in comments) are mapped into @'177. @d invalid_code=@'177 {code deserving an error message} @= for k:=first_ord to last_ord do xord[chr(k)]:=invalid_code; xord[' ']:=" "; xord['!']:="!"; xord['"']:=""""; xord['#']:="#"; xord['$']:="$"; xord['%']:="%"; xord['&']:="&"; xord['''']:="'"; xord['(']:="("; xord[')']:=")"; xord['*']:="*"; xord['+']:="+"; xord[',']:=","; xord['-']:="-"; xord['.']:="."; xord['/']:="/"; xord['0']:="0"; xord['1']:="1"; xord['2']:="2"; xord['3']:="3"; xord['4']:="4"; xord['5']:="5"; xord['6']:="6"; xord['7']:="7"; xord['8']:="8"; xord['9']:="9"; xord[':']:=":"; xord[';']:=";"; xord['<']:="<"; xord['=']:="="; xord['>']:=">"; xord['?']:="?"; xord['@@']:="@@"; xord['A']:="A"; xord['B']:="B"; xord['C']:="C"; xord['D']:="D"; xord['E']:="E"; xord['F']:="F"; xord['G']:="G"; xord['H']:="H"; xord['I']:="I"; xord['J']:="J"; xord['K']:="K"; xord['L']:="L"; xord['M']:="M"; xord['N']:="N"; xord['O']:="O"; xord['P']:="P"; xord['Q']:="Q"; xord['R']:="R"; xord['S']:="S"; xord['T']:="T"; xord['U']:="U"; xord['V']:="V"; xord['W']:="W"; xord['X']:="X"; xord['Y']:="Y"; xord['Z']:="Z"; xord['[']:="["; xord['\']:="\"; xord[']']:="]"; xord['^']:="^"; xord['_']:="_"; xord['`']:="`"; xord['a']:="a"; xord['b']:="b"; xord['c']:="c"; xord['d']:="d"; xord['e']:="e"; xord['f']:="f"; xord['g']:="g"; xord['h']:="h"; xord['i']:="i"; xord['j']:="j"; xord['k']:="k"; xord['l']:="l"; xord['m']:="m"; xord['n']:="n"; xord['o']:="o"; xord['p']:="p"; xord['q']:="q"; xord['r']:="r"; xord['s']:="s"; xord['t']:="t"; xord['u']:="u"; xord['v']:="v"; xord['w']:="w"; xord['x']:="x"; xord['y']:="y"; xord['z']:="z"; xord['{']:="{"; xord['|']:="|"; xord['}']:="}"; xord['~']:="~"; @ In order to help catch errors of badly nested parentheses, \.{ACCENTS} assumes that the user will begin each line with a number of blank spaces equal to some constant times the number of open parentheses at the beginning of that line. However, the program doesn't know in advance what the constant is, nor does it want to print an error message on every line for a user who has followed no consistent pattern of indentation. Therefore the following strategy is adopted: If the user has been consistent with indentation for ten or more lines, an indentation error will be reported. The constant of indentation is reset on every line that should have nonzero indentation. @= @!line:integer; {the number of the current line} @!good_indent:integer; {the number of lines since the last bad indentation} @!indent: integer; {the number of spaces per open parenthesis, zero if unknown} @!level: integer; {the current number of open parentheses} @ @= line:=0; good_indent:=0; indent:=0; level:=0; @ The input need not really be broken into lines of any maximum length, and we could read it character by character without any buffering. But we shall place it into a small buffer so that offending lines can be displayed in error messages. @= @!left_ln,@!right_ln:boolean; {are the left and right ends of the buffer at end-of-line marks?} @!limit:0..buf_size; {position of the last character present in the buffer} @!loc:0..buf_size; {position of the last character read in the buffer} @!buffer:array[1..buf_size] of char; @!input_has_ended:boolean; {there is no more input to read} @ @= limit:=0; loc:=0; left_ln:=true; right_ln:=true; input_has_ended:=false; @ The following routine prints an error message and an indication of where the error was detected. The error message should not include any final punctuation, since this procedure supplies its own. @d err_print(#)==begin if chars_on_line>0 then print_ln(' '); print(#); show_error_context; end @p procedure show_error_context; {prints the current scanner location} var k:0..buf_size; {an index into |buffer|} begin print_ln(' (line ',line:1,').'); if not left_ln then print('...'); for k:=1 to loc do print(buffer[k]); {print the characters already scanned} print_ln(' '); if not left_ln then print(' '); for k:=1 to loc do print(' '); {space out the second line} for k:=loc+1 to limit do print(buffer[k]); {print the characters yet unseen} if right_ln then print_ln(' ')@+else print_ln('...'); chars_on_line:=0; end; @ Here is a procedure that does the right thing when we are done reading the present contents of the buffer. It keeps |buffer[buf_size]| empty, in order to avoid range errors on certain \PASCAL\ compilers. An infinite sequence of right parentheses is placed at the end of the file, so that the program is sure to get out of whatever level of nesting it is in. On some systems it is desirable to modify this code so that tab marks in the buffer are replaced by blank spaces. (Simply setting |xord[chr(@'11)]:=" "| would not work; for example, two-line error messages would not come out properly aligned.) @^system dependencies@> @p procedure fill_buffer; begin left_ln:=right_ln; limit:=0; loc:=0; if left_ln then begin if line>0 then read_ln(adj_file); incr(line); end; if eof(adj_file) then begin limit:=1; buffer[1]:=')'; right_ln:=false; input_has_ended:=true; end else begin while (limit; end; end; @ The interesting part about |fill_buffer| is the part that learns what indentation conventions the user is following, if any. @d bad_indent(#)==begin if good_indent>=10 then err_print(#); good_indent:=0; indent:=0; end @= begin while (loc else if indent=0 then if loc mod level=0 then begin indent:=loc div level; good_indent:=1; end else good_indent:=0 else if indent*level=loc then incr(good_indent) else bad_indent('Warning: Inconsistent indentation; ', @.Warning: Inconsistent indentation...@> 'you are at parenthesis level ',level:1); end; end @* Basic scanning routines. The global variable |cur_char| holds the ASCII code corresponding to the character most recently read from the input buffer, or to a character that has been substituted for the real one. @= @!cur_char:ASCII_code; {we have just read this} @ Here is a procedure that sets |cur_char| to an ASCII code for the next character of input, if that character is a letter or digit or slash or \.>. Otherwise it sets |cur_char:=" "|, and the input system will be poised to reread the character that was rejected, whether or not it was a space. Lower case letters are converted to upper case. @p procedure get_keyword_char; begin while (loc=limit)and(not right_ln) do fill_buffer; if loc=limit then cur_char:=" " {end-of-line counts as a delimiter} else begin cur_char:=xord[buffer[loc+1]]; if cur_char>="a" then cur_char:=cur_char-@'40; if ((cur_char>="0")and(cur_char<="9")) then incr(loc) else if ((cur_char>="A")and(cur_char<="Z")) then incr(loc) else if cur_char="/" then incr(loc) else if cur_char=">" then incr(loc) else cur_char:=" "; end; end; @ The following procedure sets |cur_char| to the next character code, and converts lower case to upper case. If the character is a left or right parenthesis, it will not be ``digested''; the character will be read again and again, until the calling routine does something like `|incr(loc)|' to get past it. Such special treatment of parentheses insures that the structural information they contain won't be lost in the midst of other error recovery operations. @d backup==begin if (cur_char>")")or(cur_char<"(") then decr(loc); end {undoes the effect of |get_next|} @p procedure get_next; {sets |cur_char| to next, balks at parentheses} begin while loc=limit do fill_buffer; incr(loc); cur_char:=xord[buffer[loc]]; if cur_char>="a" then if cur_char<="z" then cur_char:=cur_char-@'40 {uppercasify} else begin if cur_char=invalid_code then begin err_print('Illegal character in the file'); @.Illegal character...@> cur_char:="?"; end; end else if (cur_char<=")")and(cur_char>="(") then decr(loc); end; @ The next procedure is used to ignore the text of a comment, or to pass over erroneous material. As such, it has the privilege of passing parentheses. It stops after the first right parenthesis that drops the level below the level in force when the procedure was called. @p procedure skip_to_end_of_item; var l:integer; {initial value of |level|} begin l:=level; while level>=l do begin while loc=limit do fill_buffer; incr(loc); if buffer[loc]=')' then decr(level) else if buffer[loc]='(' then incr(level); end; if input_has_ended then err_print('File ended unexpectedly: No closing ")"'); @.File ended unexpectedly...@> cur_char:=" "; {now the right parenthesis has been read and digested} end; @ A similar procedure copies the bytes remaining in an item. The copied bytes go into the array |vf| with leading blanks ignored. @p procedure copy_to_end_of_item; label 30; var l:integer; {initial value of |level|} @!nonblank_found:boolean; {have we seen a nonblank character yet?} begin l:=level; nonblank_found:=false; while true do begin while loc=limit do fill_buffer; if buffer[loc+1]=')' then if level=l then goto 30@+else decr(level); incr(loc); if buffer[loc]='(' then incr(level); if buffer[loc]<>' ' then nonblank_found:=true; if nonblank_found then if xord[buffer[loc]]=invalid_code then begin err_print('Illegal character in the file'); @.Illegal character...@> vf_store("?"); end else vf_store(xord[buffer[loc]]); end; 30:end; @ Sometimes we merely want to skip past characters in the input until we reach a left or a right parenthesis. For example, we do this whenever we have finished scanning a property value and we hope that a right parenthesis is next (except for possible blank spaces). @d skip_to_paren==repeat get_next@;@+ until (cur_char="(")or(cur_char=")") @d skip_error(#)==begin err_print(#); skip_to_paren; end {this gets to the right parenthesis if something goes wrong} @d flush_error(#)==begin err_print(#); skip_to_end_of_item; end {this gets past the right parenthesis if something goes wrong} @ After a property value has been scanned, we want to move just past the right parenthesis that should come next in the input (except for possible blank spaces). @p procedure finish_the_property; {do this when the value has been scanned} begin while cur_char=" " do get_next; if cur_char<>")" then err_print('Junk after property value will be ignored'); @.Junk after property value...@> skip_to_end_of_item; end; @* Scanning property names. We have to figure out the meaning of names that appear in the \.{VPL} file, by looking them up in a dictionary of known keywords. Keyword number $n$ appears in locations |start[n]| through |start[n+1]-1| of an array called |dictionary|. @d max_name_index=20 {upper bound on the number of keywords} @d max_letters=150 {upper bound on the total length of all keywords} @= @!start:array[1..max_name_index] of 0..max_letters; @!dictionary:array[0..max_letters] of ASCII_code; @!start_ptr:0..max_name_index; {the first available place in |start|} @!dict_ptr:0..max_letters; {the first available place in |dictionary|} @ @= start_ptr:=1; start[1]:=0; dict_ptr:=0; @ When we are looking for a name, we put it into the |cur_name| array. When we have found it, the corresponding |start| index will go into the global variable |name_ptr|. @d longest_name=20 {length of \.{DEFAULTRULETHICKNESS}} @= @!cur_name:array[1..longest_name] of ASCII_code; {a name to look up} @!name_lng:0..longest_name; {its length} @!name_ptr:0..max_name_index; {its ordinal number in the dictionary} @ A conventional hash table with linear probing (cf.\ Algorithm 6.4L in {\sl The Art of Computer Pro\-gram\-ming\/}) is used for the dictionary operations. If |nhash[h]=0|, the table position is empty, otherwise |nhash[h]| points into the |start| array. @d hash_prime=141 {size of the hash table} @= @!nhash:array[0..hash_prime-1] of 0..max_name_index; @!cur_hash:0..hash_prime-1; {current position in the hash table} @ @= @!h:0..hash_prime-1; {runs through the hash table} @ @= for h:=0 to hash_prime-1 do nhash[h]:=0; @ Since there is no chance of the hash table overflowing, the procedure is very simple. After |lookup| has done its work, |cur_hash| will point to the place where the given name was found, or where it should be inserted. @p procedure lookup; {finds |cur_name| in the dictionary} var k:0..longest_name; {index into |cur_name|} @!j:0..max_letters; {index into |dictionary|} @!not_found:boolean; {clumsy thing necessary to avoid |goto| statement} begin @; not_found:=true; while not_found do begin if cur_hash=0 then cur_hash:=hash_prime-1@+else decr(cur_hash); if nhash[cur_hash]=0 then not_found:=false else begin j:=start[nhash[cur_hash]]; if start[nhash[cur_hash]+1]=j+name_lng then begin not_found:=false; for k:=1 to name_lng do if dictionary[j+k-1]<>cur_name[k] then not_found:=true; end; end; end; name_ptr:=nhash[cur_hash]; end; @ @= cur_hash:=cur_name[1]; for k:=2 to name_lng do cur_hash:=(cur_hash+cur_hash+cur_name[k]) mod hash_prime @ The ``meaning'' of the keyword that begins at |start[k]| in the dictionary is kept in |equiv[k]|. The numeric |equiv| codes are given symbolic meanings by the following definitions. @d comment_code=0 @d design_units_code=1 @d character_code=2 @d vtitle_code=3 @d User_def_scheme_code=4 @d global_code=5 {this is the last one accepted at outer level} @d left_code=6 @d right_code=7 @d up_code=8 @d down_code=9 @= @!equiv:array[0..max_name_index] of byte; @!cur_code:byte; {equivalent most recently found in |equiv|} @ We have to get the keywords into the hash table and into the dictionary in the first place (sigh). The procedure that does this has the desired |equiv| code as a parameter. In order to facilitate \.{WEB} macro writing for the initialization, the keyword being initialized is placed into the last positions of |cur_name|, instead of the first positions. @p procedure enter_name(v:byte); {|cur_name| goes into the dictionary} var k:0..longest_name; begin for k:=1 to name_lng do cur_name[k]:=cur_name[k+longest_name-name_lng]; {now the name has been shifted into the correct position} lookup; {this sets |cur_hash| to the proper insertion place} nhash[cur_hash]:=start_ptr; equiv[start_ptr]:=v; for k:=1 to name_lng do begin dictionary[dict_ptr]:=cur_name[k]; incr(dict_ptr); end; incr(start_ptr); start[start_ptr]:=dict_ptr; end; @ Here are the macros to load a name of up to 20 letters into the dictionary. For example, the macro |load5| is used for five-letter keywords. @d tail(#)==enter_name(#) @d t20(#)==cur_name[20]:=#;tail @d t19(#)==cur_name[19]:=#;t20 @d t18(#)==cur_name[18]:=#;t19 @d t17(#)==cur_name[17]:=#;t18 @d t16(#)==cur_name[16]:=#;t17 @d t15(#)==cur_name[15]:=#;t16 @d t14(#)==cur_name[14]:=#;t15 @d t13(#)==cur_name[13]:=#;t14 @d t12(#)==cur_name[12]:=#;t13 @d t11(#)==cur_name[11]:=#;t12 @d t10(#)==cur_name[10]:=#;t11 @d t9(#)==cur_name[9]:=#;t10 @d t8(#)==cur_name[8]:=#;t9 @d t7(#)==cur_name[7]:=#;t8 @d t6(#)==cur_name[6]:=#;t7 @d t5(#)==cur_name[5]:=#;t6 @d t4(#)==cur_name[4]:=#;t5 @d t3(#)==cur_name[3]:=#;t4 @d t2(#)==cur_name[2]:=#;t3 @d t1(#)==cur_name[1]:=#;t2 @d load2==name_lng:=2;t19 @d load3==name_lng:=3;t18 @d load4==name_lng:=4;t17 @d load5==name_lng:=5;t16 @d load6==name_lng:=6;t15 @d load7==name_lng:=7;t14 @d load8==name_lng:=8;t13 @d load9==name_lng:=9;t12 @d load10==name_lng:=10;t11 @d load11==name_lng:=11;t10 @d load12==name_lng:=12;t9 @d load13==name_lng:=13;t8 @d load14==name_lng:=14;t7 @d load15==name_lng:=15;t6 @d load16==name_lng:=16;t5 @d load17==name_lng:=17;t4 @d load18==name_lng:=18;t3 @d load19==name_lng:=19;t2 @d load20==name_lng:=20;t1 @ @= equiv[0]:=comment_code; {this is used after unknown keywords} load11("D")("E")("S")("I")("G")("N") ("U")("N")("I")("T")("S")(design_units_code);@/ load9("C")("H")("A")("R")("A")("C")("T")("E")("R")(character_code);@/ load7("C")("O")("M")("M")("E")("N")("T")(comment_code);@/ load6("V")("T")("I")("T")("L")("E")(vtitle_code);@/ load6("G")("L")("O")("B")("A")("L")(global_code);@/ load4("L")("E")("F")("T")(left_code);@/ load5("R")("I")("G")("H")("T")(right_code);@/ load2("U")("P")(up_code);@/ load4("D")("O")("W")("N")(down_code);@/ load10("U")("S")("E")("R")("S")("C")("H")("E")("M")("E")(User_def_scheme_code)@/ @ When a left parenthesis has been scanned, the following routine is used to interpret the keyword that follows, and to store the equivalent value in |cur_code|. @p procedure get_name; begin incr(loc); incr(level); {pass the left parenthesis} cur_char:=" "; while cur_char=" " do get_next; if (cur_char>")")or(cur_char<"(") then decr(loc); {back up one character} name_lng:=0; get_keyword_char; {prepare to scan the name} while cur_char<>" " do begin if name_lng=longest_name then cur_name[1]:="X" {force error} else incr(name_lng); cur_name[name_lng]:=cur_char; get_keyword_char; end; lookup; if name_ptr=0 then err_print('Sorry, I don''t know that property name'); @.Sorry, I don't know...@> cur_code:=equiv[name_ptr]; end; @* Scanning numeric data. The next thing we need is a trio of subroutines to read the one-byte, four-byte, and real numbers that may appear as property values. These subroutines are careful to stick to numbers between $-2^{31}$ and $2^{31}-1$, inclusive, so that a computer with two's complement 32-bit arithmetic will not be interrupted by overflow. @ The first number scanner, which returns a one-byte value, surely has no problems of arithmetic overflow. @p function get_byte:byte; {scans a one-byte property value} var acc:integer; {an accumulator} @!t:ASCII_code; {the type of value to be scanned} begin repeat get_next; until cur_char<>" "; {skip the blanks before the type code} t:=cur_char; acc:=0; repeat get_next; until cur_char<>" "; {skip the blanks after the type code} if t="C" then @ else if t="D" then @ else if t="O" then @ else if t="H" then @ else skip_error('You need "C" or "D" or "O" or "H" here'); @.You need "C" or "D" ...here@> cur_char:=" "; get_byte:=acc; end; @ The |get_next| routine converts lower case to upper case, but it leaves the character in the buffer, so we can unconvert it. @= if (cur_char>=@'41)and(cur_char<=@'176)and ((cur_char<"(")or(cur_char>")")) then acc:=xord[buffer[loc]] else skip_error('"C" value must be standard ASCII and not a paren') @:C value}\.{"C" value must be...@> @ @= begin while (cur_char>="0")and(cur_char<="9") do begin acc:=acc*10+cur_char-"0"; if acc>255 then begin skip_error('This value shouldn''t exceed 255'); @.This value shouldn't...@> acc:=0; cur_char:=" "; end else get_next; end; backup; end @ @= begin while (cur_char>="0")and(cur_char<="7") do begin acc:=acc*8+cur_char-"0"; if acc>255 then begin skip_error('This value shouldn''t exceed ''377'); @.This value shouldn't...@> acc:=0; cur_char:=" "; end else get_next; end; backup; end @ @= begin while ((cur_char>="0")and(cur_char<="9"))or ((cur_char>="A")and(cur_char<="F")) do begin if cur_char>="A" then cur_char:=cur_char+"0"+10-"A"; acc:=acc*16+cur_char-"0"; if acc>255 then begin skip_error('This value shouldn''t exceed "FF'); @.This value shouldn't...@> acc:=0; cur_char:=" "; end else get_next; end; backup; end @ The remaining scanning routine is the most interesting. It scans a real constant and returns the nearest |fix_word| approximation to that constant. When a real value is desired, we might as well treat `\.D' and `\.R' formats as if they were identical. @p function get_fix:fix_word; {scans a real property value} var negative:boolean; {was there a minus sign?} @!acc:integer; {an accumulator} @!int_part:integer; {the integer part} @!j:0..7; {the number of decimal places stored} begin repeat get_next; until cur_char<>" "; {skip the blanks before the type code} negative:=false; acc:=0; {start with the accumulators zero} if (cur_char<>"R")and(cur_char<>"D") then skip_error('An "R" or "D" value is needed here') @.An "R" or "D" ... needed here@> else begin @; while (cur_char>="0") and (cur_char<="9") do @; int_part:=acc; acc:=0; if cur_char="." then @; if (acc>=unity)and(int_part=2047) then skip_error('Real constants must be less than 2048') @.Real constants must be...@> else acc:=int_part*unity+acc; end; if negative then get_fix:=-acc@+else get_fix:=acc; end; @ @= repeat get_next; if cur_char="-" then begin cur_char:=" "; negative:=true; end else if cur_char="+" then cur_char:=" "; until cur_char<>" " @ @= begin acc:=acc*10+cur_char-"0"; if acc>=2048 then begin skip_error('Real constants must be less than 2048'); @.Real constants must be...@> acc:=0; cur_char:=" "; end else get_next; end @ To scan the fraction $.d_1d_2\ldots\,$, we keep track of up to seven of the digits $d_j$. A correct result is obtained if we first compute $f^\prime=\lfloor 2^{21}(d_1\ldots d_j)/10^j\rfloor$, after which $f=\lfloor(f^\prime+1)/2\rfloor$. It is possible to have $f=1.0$. @= @!fraction_digits:array[1..7] of integer; {$2^{21}$ times $d_j$} @ @= begin j:=0; get_next; while (cur_char>="0")and(cur_char<="9") do begin if j<7 then begin incr(j); fraction_digits[j]:=@'10000000*(cur_char-"0"); end; get_next; end; acc:=0; while j>0 do begin acc:=fraction_digits[j]+(acc div 10); decr(j); end; acc:=(acc+10) div 20; end @* The auxiliary input phase. We're ready now to read and parse the \.{ADJ} file, storing property values as we go. @= @!adj_fn_start:0..vf_size; {pointer to the beginning of \.{VTITLE} string} @!adj_fn_length:0..256; {length of \.{VTITLE} string} @!adj_design_units:fix_word; {design size factor for adjustments file input} @!global_vadjust,@!global_hadjust:fix_word; {global accent adjustment values} @!vadjust,@!hadjust:array[byte]of fix_word; {adjustment values for individual characters} @ @= adj_design_units:=unity; global_vadjust:=0; global_hadjust:=0; for k:=0 to 255 do begin vadjust[k]:=0; hadjust[k]:=0 end; @ @= begin cur_char:=" "; repeat while cur_char=" " do get_next; if cur_char="(" then @ else if (cur_char=")")and not input_has_ended then begin err_print('Extra right parenthesis'); incr(loc); cur_char:=" "; end @.Extra right parenthesis@> else if not input_has_ended then junk_error; until input_has_ended; for k:=0 to 255 do begin vadjust[k]:=vadjust[k]+global_vadjust; hadjust[k]:=hadjust[k]+global_hadjust; end; end @ The |junk_error| routine just referred to is called when something appears in the forbidden area between properties of a property list. @p procedure junk_error; {gets past no man's land} begin err_print('There''s junk here that is not in parentheses'); @.There's junk here...@> skip_to_paren; end; @ For each list, we are supposed to read the data from the left parenthesis that is the current value of |cur_char| to the right parenthesis that matches it in the input. The main complication is to recover with reasonable grace from various error conditions that might arise. @= begin get_name; if cur_code=comment_code then skip_to_end_of_item else if cur_code>global_code then flush_error('This property name doesn''t belong on the outer level') @.This property name doesn't belong...@> else begin @; finish_the_property; end; end @ @= case cur_code of design_units_code: @; vtitle_code: begin adj_fn_start:=vf_ptr; copy_to_end_of_item; if vf_ptr>adj_fn_start+255 then begin err_print('VTITLE clipped to 255 characters'); adj_fn_length:=255; @.VTITLE clipped...@> for k:=1 to adj_fn_length-1 do begin if k<=vtitle_length then begin if vf[k+adj_fn_start]<>vf[k+vtitle_start] then goto final_end end else if vf[k+adj_fn_start]<>" " then goto final_end end; for k:=adj_fn_length-1 to vtitle_length-1 do if vf[k+vtitle_start]<>" " then begin message('VTITLE in the adjustments input file (.ADJ) differs'); message('from the actual .VF font name.'); message('Sorry --- I have to quit now.'); goto final_end; end; end else adj_fn_length:=vf_ptr-vtitle_start; end; global_code:@; character_code:@; User_def_scheme_code: @; end @ @= begin next_d:=get_fix; if next_d<=0 then err_print('The number of units per design size must be positive') @.The number of units...@> else adj_design_units:=next_d; end @ @d finish_inner_property_list== begin decr(loc);incr(level);cur_char:=")";end; @= begin while level=1 do begin while cur_char=" " do get_next; if cur_char="(" then @ else if cur_char=")" then skip_to_end_of_item else junk_error; end; finish_inner_property_list; end @ @= begin get_name; if cur_code=comment_code then skip_to_end_of_item else begin case cur_code of down_code: global_vadjust:=round(get_fix/(adj_design_units/float_constant(1048576))); left_code: global_hadjust:=round(get_fix/(adj_design_units/float_constant(1048576))); up_code: global_vadjust:=-round(get_fix/(adj_design_units/float_constant(1048576))); right_code: global_hadjust:=-round(get_fix/(adj_design_units/float_constant(1048576))); othercases flush_error('Direction must be either of UP, DOWN, LEFT, or RIGHT.'); end; finish_the_property; end; end @ @= begin c:=get_byte; {read the character code for which the accent is being adjusted} while level=1 do begin while cur_char=" " do get_next; if cur_char="(" then @ else if cur_char=")" then skip_to_end_of_item else junk_error; end; finish_inner_property_list; end @ @= begin get_name; if cur_code=comment_code then skip_to_end_of_item else begin case cur_code of down_code: vadjust[c]:=round(get_fix/(adj_design_units/float_constant(1048576))); right_code: hadjust[c]:=round(get_fix/(adj_design_units/float_constant(1048576))); up_code: vadjust[c]:=-round(get_fix/(adj_design_units/float_constant(1048576))); left_code: hadjust[c]:=-round(get_fix/(adj_design_units/float_constant(1048576))); othercases flush_error('Direction must be either of UP, DOWN, LEFT, or RIGHT.'); end; finish_the_property; end; end @* The main program. The routines sketched out so far need to be packaged into separate procedures, on some systems, since some \PASCAL\ compilers place a strict limit on the size of a routine. The packaging is done here in an attempt to avoid some system-dependent changes. First comes the |organize| procedure, which reads the input data and get ready for subsequent events. If something goes wrong, the routine returns |false|. @p function organize:boolean; label final_end, exit; var T_tfm_ptr:T_index; {an index into |tfm|} begin @; @; @; organize:=true; return; final_end: organize:=false; exit: @@/ end; @# procedure corr_and_check; var @!c:0..256; {runs through all character codes} @!hh:0..hash_size; {an index into |hash_list|} @!lig_ptr:0..max_lig_steps; {an index into |lig_kern|} @!g:byte; {a character generated by the current character |c|} begin @ end; @# procedure vf_output; var @!c:byte; {runs through all character codes} @!k:integer; {loop index} begin @; end; @# function adjustments_input:boolean; label final_end, exit; var @!k:byte; begin @; if adj_present then @; adjustments_input:=true; return; final_end: adjustments_input:=false; exit: end; @ Here is where \.{ACCENTS} begins and ends. @p begin initialize;@/ if not organize then goto final_end;@/ if not adjustments_input then goto final_end;@/ @; @; @; @; @; @; @; @; corr_and_check; @; vf_output; final_end:end. @* System-dependent changes. This section should be replaced, if necessary, by changes to the program that are necessary to make \.{ACCENTS} work at a particular installation. It is usually best to design your change file so that all changes to previous sections preserve the section numbering; then everybody's version will be consistent with the printed program. More extensive changes, which introduce new sections, can be inserted here; then only the index itself will get a new section number. @^system dependencies@> @* Index. Pointers to error messages appear here together with the section numbers where each ident\-i\-fier is used.