\useoptex % We are using OpTeX, no LaTeX \fontfam[pagella] \report \emergencystretch=2em \hbadness=2100 \def\thed#1{\ifnum#1<10 0\fi\the#1} \let\isAleBsaved=\_isAleB \def\preprocessindex{% sort \_iilist with the rule: .word is before :word is before word \def\_isAleB ##1##2{% \edef\tmpb{\csstring ##1&\relax\csstring ##2&\relax}% \_ea\testAleB \tmpb } \ifx\_iilist\empty \else \_dosorting\_iilist \fi \let\_isAleB=\isAleBsaved } \def\testAleB #1#2#3\relax #4#5#6\relax{% \ifnum `#2<`#5 \_AleBtrue \else \_AleBfalse \fi } \catcode`<=13 \def<#1>{{\def\_dsp { }\iindex{:#1}\iis {:#1} {$\langle\hbox{\it#1}\rangle$}}% $\def\,{\hskip1.5pt plus.3pt minus1.2pt}\,\langle\hbox{\it#1}\rangle\,$} \def\l#1>{$\def\,{\hskip1.5pt plus.3pt minus1.2pt}\,\langle\hbox{\it#1}\rangle\,$} \everyintt={\catcode`<=13 \Blue} \onlyrgb \addto\_titfont\Blue \def\Green{\setrgbcolor{0 .8 0}} \def\myunderline#1{\vtop{\hbox{#1}\kern-\prevdepth \kern2pt \hrule}} \adef/#1.{\myunderline{#1}} \adef|#1;{{\Red#1}} \toksapp\everyintt{\catcode`\/=13 \catcode`\|=13} \catcode`\/=12 \catcode`\|=12 \ifx\_partokenset\undefined \def\comment#1\par{{\adef\%{\%}\Green\%#1}\par} \else \def\comment#1\_par{{\adef\%{\%}\Green\%#1}\_par} \fi \catcode`\%=14 \toksapp\everytt{\adef\%{\comment}} \def\*{*} \def\c#1{$_{#1}$} \activettchar` \enquotes % to index macros: \def\i #1 {\makedest{#1}\ii .#1 \iis .#1 {\ilink[cs:#1]{\code{\\#1}}}} \def\x`{\bgroup\_setverb\xx} \bgroup \lccode\string`\.=\string`\` \lowercase{\egroup \def\xx #1#2.{\i #2 \egroup `#1#2.}} \def\y`{\bgroup\_setverb\yy} \def\yy #1#2={\i #2 \egroup `#1#2=} \def\z`{`\let<=\l} \let\_cslinkcolor=\Blue \def\q`#1`{\link[cs:\csstring#1]\Blue{\tt\string#1}} % tex-nutshell.pdf includes destinations to the explanation of the primitive % control sequences and plain TeX macros in the form: "cs:sequence". For example, % you can try: % % http://petr.olsak.net/ftp/olsak/optex/tex-nutshell.pdf#cs:hbox % % All such sequences are listed in the tex-nutshell.eref file. 
% You can read this eref file into your document and create external links to
% these destinations.

\newwrite \eref
\immediate\openout\eref=\jobname.eref
\def\makedest#1{}
\def\makedestactive#1{%
   \ifcsname cs:#1\endcsname \else
      \immediate\write\eref{\string\Xeref{#1}}%
      \dest[cs:#1]%
      \sxdef{cs:#1}{}%
   \fi
}
\def\noda{\def\makedest##1{}}
\def\doda{\let\makedest\makedestactive}

% Hyperlinks
\hyperlinks\Red\Green
\fnotelinks\Magenta\Magenta

\tit \TeX/ in a Nutshell

\author Petr Olšák

The pure \TeX/ features are described here, no features provided by macro extensions. Only the last section gives a summary of plain \TeX/ macros. The main goal of this document is its brevity. So features are described only roughly and sometimes inaccurately here. If you need to know more then you can read freely available books, for example \ulink[https://eijkhout.net/texbytopic/texbytopic.html]{\TeX/ by topic} or \ulink[http://petr.olsak.net/tbn.html]{\TeX/book naruby}. Try typing `texdoc texbytopic` on your system.

\ii OpTeX The \ulink[http://petr.olsak.net/optex]{\OpTeX/} manual supposes that the user already knows the basic principles of \TeX/ itself. If you are converting from \LaTeX/ to \OpTeX/ for example\fnote {Congratulations on your decision:-)} then you may welcome a summary document that presents these basic principles because \LaTeX/ manuals typically don't distinguish between \TeX/ features and features specially implemented by \LaTeX/ macros.

I would like to express my special thanks to Barbara Beeton who read my text very carefully and suggested hundreds of language corrections and improvements and also discovered many of my real mistakes. Thanks to her, my text is better. But if there are any other mistakes then they are only mine and I'll be pleased if you send me a bug report in that case.
\notoc\nonum\sec Table of contents \maketoc \outlines0 \sec[termi] Terminology The main principle of \TeX/ is that its input files can be a mix of the material which could be printed and \ii control~sequence {\em control sequences} which give a setting for built-in algorithms of \TeX/ or give a special message to \TeX/ what to do with the inputted material. Each control sequence (typically a word prefixed by a backslash) has its \ii meaning~of~control~sequence {\em meaning}. There are four types of meanings of control sequences: \begitems * the control sequence can be a \ii register {\em register}; this means it represents a variable which is able to keep a value. There are \ii primitive/register {\em primitive registers}. Their values influence behavior of built-in algorithms (e.g., \i hsize `\hsize`, \i parindent `\parindent`, \i hyphenpenalty `\hyphenpenalty`). On the other hand \ii declared/register {\em declared registers} are used by macros (e.g., \i medskipamount `\medskipamount` used in plain \TeX/ or {\doda\i ttindent `\ttindent`} used by \ii OpTeX \OpTeX/). * the control sequence can be a \ii primitive/command {\em primitive command}, which runs a built-in algorithm (e.g., \i def `\def` declares a macro, \i halign `\halign` runs the algorithm for tables, \i hbox `\hbox` creates a box in typesetting output). * the control sequence can be a \ii character/constant {\em character constant} (declared by \i chardef `\chardef` or \i mathchardef `\mathchardef` primitive command) or a \ii character/equivalent {\em character equivalent} (declared by \i let `\let\sequence=<character>`) or a {\em font selector} (declared by \i font `\font` primitive command). * the control sequence can be a \ii macro {\em macro}. When it is read, it is replaced by its \ii replacement/text {\em replacement text} in the input queue. If there are more macros in the replacement text, all macros are replaced. 
This is called the \ii expansion/process {\em expansion process} which ends when only printable text, primitive commands (listed in section~\ref[main]), registers (section~\ref[reg]), character constants, or font selectors remain. \enditems Example. When \TeX/ reads: \begtt \def\TeX{T\kern-.1667em\lower.5ex\hbox{E}\kern-.125emX} \endtt in a macro file, then the `\def` primitive command saves the information that {\doda\i TeX `\TeX`} is a control sequence with meaning \"macro", the replacement text is declared here, and it is a mix of a material to be typeset: `T`, `E` and `X` and primitive commands \i kern `\kern`, \i lower `\lower`, \i hbox `\hbox` with their parameters in given syntax. Each primitive command has a declared syntax; for example, `\kern` must be followed by a dimension specification in the format \"decimal number followed by a unit". More about this primitive syntax is in sections~\ref[reg], \ref[expand] and~\ref[main]. When a control sequence `\TeX` with meaning \"macro" occurs in the input stream, then it is \ii expansion {\em expanded} to its replacement text, i.e.\ the sequence of typesetting material and primitive commands. The `\TeX` macro expands to `T\kern-.1667em\lower.5ex\hbox{E}\kern-.125emX` and the logo \TeX/ is printed as a result of this processing. None of the control sequences have their definitive meaning. A control sequence could change its meaning by re-defining it as a new macro (using `\def`), redeclaring it as an arbitrary object in \TeX/ (using \i let `\let`), etc. When you re-define a primitive control sequence then the access to its value or built-in algorithm is lost. This is a reason why \ii OpTeX \OpTeX/ macros duplicate all primitive sequences (\i hbox `\hbox` and `\_hbox`) with the same meaning and use only \"private" control sequences (prefixed by `_`). So, a user can re-define `\hbox` without the loss of the primitive command `\_hbox`. 
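
The fact that the meaning of a control sequence is not fixed can be tried with the following minimal sketch in plain \TeX/. The name `\savedhbox` is only an illustrative assumption; no format declares it.

\begtt
\let\savedhbox=\hbox  % \savedhbox gets the current (primitive) meaning of \hbox
\def\hbox{oops}       % \hbox is a macro now, the primitive is hidden under this name
\savedhbox{A}         % the built-in box algorithm is still accessible
\let\hbox=\savedhbox  % \hbox gets its primitive meaning again
\endtt
%
Plain \TeX/ macros use `\hbox` internally, so such a re-definition should be restored before any of them is used.
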
\sec Formats, engines \TeX/ is able to start without any macros preloaded in the so-called \ii ini-TeX/state {\em ini-\TeX/ state} (the `-ini` option on the command line must be used). It already knows only primitive registers and primitive commands at this state.\fnote {Roughly speaking, if you know all these primitive objects (about 300 in classical \TeX/, 700 in Lua\TeX/) and the syntax of all these primitive commands and all the built-in algorithms, then you know all about \TeX. But starting to produce ordinary documents from this primitive level without macro support is nearly impossible.} When ini-\TeX/ reads macro files then new control sequences are declared as macros, declared registers, character constants or font selectors. The primitive command \i dump `\dump` saves the binary image of the \TeX/ memory (with newly declared control sequences) to the \ii format,format/file {\em format file} (`.fmt` extension). The original intention of existing format files was to prepare a collection of macro declarations and register settings, to load default fonts, and to dump this information to a file for later use. Such a collection typically declares macros for the markup of documents and for typesetting design. This is the reason why we call these files {\em format files}: they give a format of documents on the output side and declare markup rules for document source files. When \TeX/ is started without the `-ini` option, it tries to load a prepared format file into its memory and to continue with reading more macros or a real document (or both). The starting point is at the place where \i dump `\dump` was processed during the ini-\TeX/ state. If the format file is not specified explicitly (by `-fmt` option on the command line) then \TeX/ tries to read the format file with the same name which is used for running \TeX. For example `tex document` runs \TeX, it loads the format `tex.fmt` and reads the `document.tex`. 
Or `latex document` runs \TeX, it loads the format `latex.fmt` and reads the `document.tex`. The `tex.fmt` is the format file dumped when \ii plain~TeX/macros {\em plain \TeX/ macros}\fnote {Plain \TeX/ macros were made by \iindex {Knuth,/Donald}Donald Knuth, the author of \TeX. It is a set of basic macros and settings which is used (more or less) as a subset of all other macro packages.} were read, and `latex.fmt` is the format file dumped when \ii LaTeX/macros {\em \LaTeX/ macros} were read. This is typically done when a \TeX/ distribution is installed without any user intervention. So, the user can run `tex document` or `latex document` without worry that these typical format files exist. From this point of view, \LaTeX/ is nothing more than a format of \TeX/, i.e.~a collection of macro declarations and register settings. A typical \TeX/ distribution has four common \ii TeX/engines {\em \TeX/ engines}, i.e.~programs. They implement classical \TeX/ algorithms with various extensions: \begitems * \TeX/ -- only classical \TeX/ algorithms by Donald Knuth, * \ii pdfTeX pdf\TeX/ -- an extension supporting PDF output directly and micro-typographical features, * \ii XeTeX \XeTeX/ -- an extension supporting Unicode and PDF output, * \ii luaTeX Lua\TeX/ -- an extension supporting Lua programming, Unicode, micro-typographical features and PDF output. \enditems Each of them is able to run in ini-\TeX/ state or with a format file. For example the command `luatex -ini macros.ini` starts Lua\TeX/ at ini-\TeX/ state, reads the `macros.ini` file and the final `\dump` command is supposed here to create a format `macros.fmt`. Then a user can use the command `luatex -fmt macros document` to load `macros.fmt` and process the `document.tex`. Or the command `luatex document` processes Lua\TeX/ with `document.tex` and with `luatex.fmt` which is a little extension of plain \TeX/ macros. Another example: `lualatex document` runs Lua\TeX/ with `lualatex.fmt`. 
It is a format with \LaTeX/ macros for Lua\TeX/ engine. Final example: `optex document` runs Lua\TeX/ with `optex.fmt` which is a format with \ii OpTeX \ulink[http://petr.olsak.net/optex]{\OpTeX/ macros}. \sec Searching data If \TeX/ needs to read something from the file system (for example the primitive command \i input `\input<file name>` or \i font `\font<font selector>=<file name>` is used) then the rule \"first wins" is applied. \TeX/ looks at the current directory first or somewhere in the \TeX/ installation second. The behavior in the second step depends on the used \TeX/ distribution. For example \ii TeXlive \ulink[https://www.tug.org/texlive]{\TeX/live} programs are linked with a \ii kpathsea {\em kpathsea} library and they do the following: Search for the given file in the current directory, then in the \ii texmf~tree \code{~/texmf} tree (data are saved by the user here), then in the `texmf-local` tree (data are saved by the system administrator here; they are not removed when the \TeX/ distribution is upgraded), then in `texmf-var` tree (data are saved automatically by programs from the \TeX/ distribution here), and then in the `texmf-dist` tree (data from the \TeX live distribution). Each directory tree can be divided into sub-trees: first level `tex`, `fonts`, `doc`, etc.; the second level is divided by \TeX/ engines or font types, etc.; more levels are typically organized to keep clarity. New files in the current directory or in the \code{~/texmf} tree are found without doing anything more, but new files in other places have to be registered by the `texhash` program (\TeX/ distributions do this automatically during their installation). \sec Processing the input The lines from input files are first transformed by the \ii tokenizer {\em tokenizer}. It reads input lines and generates a sequence of tokens. These are the main goals of the tokenizer: \begitems * It converts each control sequence to a single token characterized by its name. 
* Other input material is tokenized as \"one token per character".
* A continuous sequence of multiple spaces is transformed into one space token.
* The end of the line is transformed into a space token, so that paragraph text can continue on the next input line and one space token is added between the last word on the previous line and the first word on the next line.
* The comment character `%` is ignored and all the text after it to the end of line is ignored too. No space is generated at the end of this line.
* Spaces from the beginning of each line are ignored. Thus, you can use arbitrary indentation in your source file without changing the result.
* Each empty line (or line with only spaces) is transformed to the token \i par `\par`. This token has primitive meaning: \"finalize the current paragraph". This implies the general rule in \TeX/ source files: paragraphs are terminated by empty lines.
\enditems

The behavior of the tokenizer is not definitive. The tokenizer works with a table of category codes. Any change of category codes of characters (done by the primitive command \i catcode {\let\,=\relax`\catcode`\code{`}`\<character>=<code>`}) influences tokenizer processing. For example, the verbatim environment is declared by setting all characters to normal meaning. By default, there are the following characters with special meaning. The tokenizer converts them or sets them as special tokens used in syntactic rules in \TeX/ later. The corresponding category codes are mentioned here as an index of the character.

\begitems
* `\`\c0\quad -- starts completion of a control sequence by the tokenizer.
* `{`\c1 and `}`\c2\quad -- open and close group or have special syntactic meaning. The main syntactic rule is: each subsequence of tokens treated by macros or primitive commands must have these pairs of tokens balanced. There is no exception. The tokenizer treats them as special tokens with meaning \"opening character\c1" and \"closing character\c2".
* `%`\c{14}\quad -- comment character, removed by the tokenizer, along with everything that follows it on the line.
* `$`\c3, `&`\c4, `#`\c6, `^`\c7, `_`\c8, `~`\c{13}\quad -- the tokenizer treats them as special tokens with meaning: \ii math/mode/selector \"math-mode selector\c3", \ii table/separator \"table separator\c4", \ii parameter/prefix \"parameter prefix for macros\c6", \ii superscript/prefix \"superscript prefix in math\c7", \ii subscript/prefix \"subscript prefix in math\c8", \ii active/character \"active character\c{13}" (the active character `~` is defined as non-breakable space in all typical formats).
* Letters and other characters are tokenized as \"letter character\c{11}" or \"other character\c{12}".
\enditems

If you need to print these special characters you can use \ii -percent,-at,-dollar,-hash `\%`, `\&`, `\$`, `\#` or `\_`. These five control sequences are declared as \"print this character" in all typical \TeX/ formats. Another possibility is to use a verbatim environment (its form depends on the format used). A last alternative: you can use \i csstring {\let\,=\relax`\csstring\<character>`} in Lua\TeX/, because it has the primitive command `\csstring` which converts `\<character>` to `<character>`\c{12}. The \ii active/character \"active character\c{13}" can be declared by \i catcode `\catcode`\code{`}`\<character>=13`. Such a `<character>` behaves like a control sequence. For example, you can define it by \i def `\def<character>{...}` and use this `<character>` as a macro. If the term `<control sequence>` is used in syntactic rules in this document then it means a real control sequence or an active character. Each control sequence is built by the tokenizer starting from `\`\c0. Its name is a continuous sequence of letters\c{11} finalized by the first non-letter. Note that \OpTeX/ sets `_` as letter\c{11}, thus control sequence names can include this character. \LaTeX/ sets the `@` as letter\c{11} when reading style and macro files.
You can look into such files and you will see many such characters inside private control sequence names declared by \LaTeX/ macros. If the first character after `\`\c0 is a non-letter (i.e.\ `<something>`\c{\ne11}), then the control sequence is finalized with only this character in its name. A so-called \ii one~character/control~sequence {\em one-character control sequence} is created. Other control sequences are \ii multiletter/control~sequence {\em multiletter control sequences}. Spaces {\char9251}\c{10} after multiletter control sequences are ignored, so the space can be used as a terminating character of the control sequence. Other characters used immediately after a control sequence are not ignored. So `\TeX !` and `\TeX!` give the same result: the control sequence `\TeX` followed immediately by `!`\c{12}. The tokenizer's output (a sequence of tokens) goes to the \ii expand/processor {\em expand processor} and its output goes to the \ii main/processor {\em main processor} of \TeX. The expand processor performs expansions of macros and of primitive commands which work at the expand processor level. See a summary of such commands in section~\ref[expand]. The main processor performs assignments to registers, declares macros by the \i def `\def` primitive command, and runs all primitive commands at the main processor level. Moreover, it creates the typesetting output as described in the next section. A very important difference between \TeX/ and other programs is that there are no strings, only sequences of tokens. We can return to the example `\def\TeX{...}` above in section~\ref[termi]. The token \i def `\def` is a control sequence with meaning \"declare a macro".
It gets the following token \i TeX `\TeX` and declares it as a macro with replacement text, which is the sequence of tokens: \medskip \bgroup \def\b#1{\kern1pt\lower2ex\llap{\c{#1}\kern-1.5pt}} \def`#1`#2,{\frame{\strut\tt\Blue\string#1}\b{#2}}\catcode`\{=12 \catcode`\}=12 \indent `T`11, `\kern`, `-`12, `.`12, `1`12, `6`12, `6`12, `7`12, `e`11, `m`11, `\lower`, `.`12, `5`12, `e`11, `x`11, `\hbox`, `{`1, `E`11, `}`2, `\kern`, `-`12, `.`12, `1`12, `2`12, `5`12, `e`11, `m`11, `X`11, \egroup \medskip If you are thinking like \TeX/ then you must forget the term \"string" because all texts in \TeX/ are preprocessed by the tokenizer when input lines are read and only sequences of tokens are manipulated inside \TeX/. The tokenizer converts two `^`\c7`^`\c7 characters followed by an ASCII uppercase letter to the Ctrl-letter ASCII code. For example `^^M` is Ctrl-M (carriage return). It converts two `^`\c7`^`\c7 followed by two hexadecimal digits (`0123456789abcdef`) to a one-byte code, for example, `^^0d` is Ctrl-M too because it has code 13. Moreover, the tokenizer of \XeTeX/ or Lua\TeX/ converts `^`\c7`^`\c7`^`\c7`^`\c7 followed by four hexadecimal digits or `^`\c7`^`\c7`^`\c7`^`\c7`^`\c7`^`\c7 followed by six hexadecimal digits to one character with a given Unicode. \sec Vertical and horizontal modes When the main processor creates the typesetting output, it alternates between vertical and horizontal mode. It starts in \ii vertical/mode,@ {\em vertical mode}: all materials are put vertically below in this mode. For example \i hbox `\hbox{a}\hbox{b}\hbox{c}` creates a above b above c in vertical mode. If something is incompatible with the vertical mode principle --- a special command working only in horizontal mode or a character itself --- then the main processor switches to \ii horizontal/mode,@ {\em horizontal mode}: it opens an unlimited horizontal data row for typesetting material and puts material next to each other. 
For example \i hbox `a\hbox{b}\hbox{c}` opens horizontal mode due to \"a character itself" `a` and creates abc in horizontal mode. When an empty line is scanned, the tokenizer creates a \i par `\par` token here and if the main processor is in horizontal mode, the `\par` command finalizes the paragraph. More exactly it returns to vertical mode, it breaks the horizontal data row filled in previous horizontal mode to parts with the \i hsize `\hsize` width. These parts are completed as \ii box {\em boxes} and they are put one below another in vertical mode. So, a paragraph of \i hsize `\hsize` width is created. Repeatedly: if there is something incompatible with the current vertical mode (typically a character), then the horizontal mode is opened and all characters (and spaces between them) are put to the horizontal data row. When an empty line is scanned, then the `\par` command is started and the horizontal data row is broken into lines of \i hsize `\hsize` width and the next paragraph is completed. In vertical mode, the material is accumulated in a vertical data column called the \ii main/vertical/list {\em main vertical list}. If the height of this material is greater than \i vsize `\vsize` then its part with maximum `\vsize` height is completed as a \ii page/box {\em page box} and shipped to the \ii output/routine {\em output routine}. A programmer or designer can declare a design of pages using macros in the output routine: header, footer, pagination, the position of the main page box, etc. The output routine completes the main page box with other material declared in the output routine and the result is shipped out as one page of the document. The main processor continues in vertical mode with the rest of the unused material in the main vertical list. Then it can switch to horizontal mode if a character occurs, etc... 
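
The switching between modes described above can be tried with the following short sketch in plain \TeX/. The letters are arbitrary and the comments describe the expected behavior.

\begtt
\hbox{a}\hbox{b}\hbox{c} % vertical mode: three boxes, a above b above c

a\hbox{b}\hbox{c}        % the character a opens horizontal mode: abc in one line

\hbox{d}                 % the empty line above finished the paragraph,
                         % so this box is appended in vertical mode again
\bye
\endtt
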
The plain \TeX/ macro \i bye `\bye` (or primitive command \i end `\end`\fnote {\LaTeX/ format re-defines this primitive control sequence `\end` to another meaning which follows the logic of \LaTeX/'s markup rules.}) starts the last `\par` command, finalizes the last paragraph (if any), completes the last page box, sends it to the output routine, finalizes the last page in it, and \TeX/ is terminated. There are \ii internal/vertical/mode {\em internal vertical mode} and \ii internal/horizontal/mode {\em internal horizontal mode}. They are activated when the main processor is typesetting material inside \i vbox `\vbox{...}` or \i hbox `\hbox{...}` primitive commands. More about boxes is in sections~\ref[boxes] and~\ref[main]. Understanding of switching between modes is very important for \TeX/ users. There are primitive commands which are context dependent on the current mode. For example, the \i par `\par` primitive command (generated by an empty line) does nothing in vertical mode but it finalizes paragraph in horizontal mode and it causes an error in math mode. Or the \i kern `\kern` primitive command creates a vertical space in vertical mode or horizontal space in horizontal mode. The following primitive commands used in vertical mode start horizontal mode: the first character of a paragraph (most common situation) or \i indent `\indent`, \i noindent `\noindent`, \i hskip `\hskip` (and its alternatives), \i vrule `\vrule` and the plain \TeX/ macro \i leavevmode `\leavevmode`\fnote {The list is not exhaustive, but most important commands are mentioned.}. When horizontal mode is opened, an indentation of \i parindent `\parindent` width is included. The exception is only if horizontal mode is started by \i noindent `\noindent`; then the paragraph has no indentation. 
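
A box alone does not open horizontal mode, which is the typical reason to use `\leavevmode`. A minimal sketch in plain \TeX/, with arbitrary letters as box content:

\begtt
\hbox{A}            % vertical mode is kept: the box is appended to the vertical list

\leavevmode\hbox{B} % horizontal mode is opened: indentation first, then the box

\noindent\hbox{C}   % horizontal mode is opened without indentation
\bye
\endtt
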
The following primitive commands used in horizontal mode finalize the paragraph and return to vertical mode: \i par `\par`, \i vskip `\vskip` (and its alternatives), \i hrule `\hrule`, \i end `\end` and the plain \TeX/ macro \i bye `\bye`.

\sec Groups in \TeX/

Each register assignment, macro declaration or font selection is local to the current group. When the current group ends then the assignments made inside the group are forgotten and the values in effect before this group was opened are restored. The groups can be delimited by a `{`\c1 and `}`\c2 pair or by \i begingroup `\begingroup` and \i endgroup `\endgroup` primitive commands or by \i bgroup `\bgroup` and \i egroup `\egroup` control sequences declared by plain \TeX. For example, plain \TeX/ declares the macros {\doda\i rm `\rm` (selects roman font), \i bf `\bf` (selects bold font) and \i it `\it`} (selects italics) and it selects the \i rm `\rm` font initially. A user can write:

\begtt
The roman font is here {\it here is italics} and the roman font continues.
\endtt
%
Not only fonts but all registers are set locally inside a group. The macro designer can declare a special environment with font selection and with more special typographical parameters in groups. The following example is a test of understanding the switching between vertical and horizontal modes.

\begtt
{\hsize=5cm
   This is the first paragraph which should be formatted to 5\,cm width.}

But it is not true...
\endtt
%
Why does the example above not create the paragraph with a 5\,cm width? The empty line \i par (`\par` command) is placed {\em after} the group is finished, so the \i hsize `\hsize` parameter has its previous value at the time when the paragraph is completed, not the value 5\,cm. The value of the \i hsize `\hsize` register\fnote {and about twenty other registers which declare the paragraph design} is used when the paragraph is completed, not at the beginning of the paragraph.
This is the reason why macro programmers explicitly put a \i par `\par` command into macros before the local environment is finished by the end of the group. Our example should look like this: \begtt {\hsize=5cm This is the first ... to 5\,cm width.\par} \endtt \sec[boxes] Box, kern, penalty, glue You can look at one character, say the `y`. It is represented by three dimensions: \ii height height (above baseline), \ii depth depth (below baseline) and \ii width width. Suppose that there are more characters printed in horizontal mode and completed as a line of a paragraph. This line has its height equal to the maximum height of characters inside it, it has the depth equal to maximum depth of all characters inside it and it has its width. Such a sequence of characters encapsulated as one typesetting element with its height, depth and width is called a \ii box {\em box}. Boxes are placed next to each other (from left to right\fnote{There is an exception for special languages.}) in horizontal mode or one below another in vertical mode. The boxes can include individual characters or spaces or boxes. The boxes can include more boxes. Paragraph lines are boxes. The page box includes paragraph lines (boxes). The finalized page with a header, page box, pagination, etc., is a box and it is shipped out to the PDF page. Understanding boxes is necessary for macro programmers and designers. You can create an individual box by the primitive command \i hbox `\hbox{<horizontal material>}` or \i vbox `\vbox{<vertical material>}`. The `<horizontal material>` is completed in internal horizontal mode and `<vertical material>` in internal vertical mode. Both cases open a group, create the material in a specified mode and close the group, where all settings are local. The `<horizontal material>` can include individual characters, boxes, horizontal \ii glue {\em glues} or \ii kern {\em kerns}. 
\"Glue" is a special term for stretchable or shrinkable and possibly breakable spaces and \"kern" is a term used for fixed nonbreakable spaces. The `<vertical material>` can include boxes, vertical glues or kerns. No individual characters. If you put an individual character in vertical mode (for example in a \i vbox `\vbox`) then horizontal mode is opened. At the end of a \i vbox `\vbox`\fnote {before the \i vbox `\vbox` group is closed} or when the \i par `\par` command is invoked, the opened paragraph is finished (with current \i hsize `\hsize` width) and the resulting lines are vertically placed inside the \i vbox `\vbox`. The completed boxes are unbreakable and they are treated as a single object in the surrounding printed material. The line boxes of a paragraph have the fixed width \i hsize `\hsize`, so there must be something stretchable or shrinkable in order to get the desired fixed width of lines. Typically the spaces between words have this feature.\fnote {When the microtypographical feature \i pdfadjustspacing `\pdfadjustspacing` is activated, then not only spaces are stretchable and shrinkable but individual characters are slightly deformed (by an invisible amount) too.} These spaces have declared their \ii default/size/of~space {\em default size}, their \ii stretchability {\em stretchability} and their \ii shrinkability {\em shrinkability} in the font metric data of the currently used font. You can place such glue explicitly by the primitive command `\hskip`: \begtt \catcode`<=13 \hskip <default size> plus<stretchability> minus<shrinkability> for example: \hskip 10pt plus5pt minus2.5pt \endtt % This example places the glue with 10\,pt default size, stretchable to 15\,pt\fnote {It can be stretchable ad absurdum (more than 15\,pt) but with very considerable \ii badness {\em badness} calculated by \TeX/ whenever glues are stretched or shrunk.} and shrinkable to 7.5\,pt as its minimal size. 
All glues in one line are stretched or shrunk proportionally, with weights given by their stretchability/shrinkability values. You can experiment with this feature by saying \i hbox `\hbox to<size>{...}`. Then the \i hbox `\hbox` is created with the given width. The glues inside this \i hbox `\hbox` then usually have to be stretched or shrunk. You can see in the log that the total \ii badness {\em badness} is calculated; it represents the amount of \"force" applied to all glue included in such an \i hbox `\hbox`. An infinitely stretchable (to an arbitrary positive value) or shrinkable (to an arbitrary negative value) glue can exist. This glue is stretched/shrunk and other glues with finite amounts of stretching or shrinking keep their default size in that case. You can insert infinitely stretchable/\penalty0shrinkable \ii glue glue using the reserved unit \ii fil `fil` in an \i hskip `\hskip` command, for example the command \i hskip `\hskip 0pt plus 1fil` means zero default size but infinitely stretchable. There is a shortcut for such glue: \i hfil `\hfil`. When you type \z`\hbox to\hsize{\hfil <text>\hfil}` then the \z`<text>` is centered. But if the \z`<text>` is wider than \i hsize `\hsize` then \TeX/ reports an \ii overfull/box `overfull \hbox`. If you want to center a wide \z`<text>` too, you can use \i hss `\hss` instead of \i hfil `\hfil`. The \i hss `\hss` primitive command is equal to `\hskip 0pt plus1fil minus1fil`. The \z`<text>` printed by \z`\hbox to\hsize{\hss<text>\hss}` is now centered whatever its width. A glue created with \ii fill `fill` stretchability or shrinkability (double ell) is infinitely more stretchable or shrinkable than glues with only a \ii fil `fil` unit. So, glues with \ii fill `fill` are stretched or shrunk and glues with only `fil` in the same box keep their default size.
For example, a macro declares centering a \z`<text>` by \i hbox \z`\hbox to\hsize{\hss <text>\hss}` and a user can create the \z`<text>` in the form \i hfill \z`\hfill <real text>`. Then \z`<real text>` is printed flushed right because \i hfill `\hfill` is a shortcut to \i hskip `\hskip0pt plus1fill` and has greater priority than glues with only a \ii fil `fil` unit. Common usage is \i hbox \z`\hbox to0pt{<text>\hss}` or \z`\hbox to0pt{\hss<text>}`. The box with zero width is created and the text overlaps the adjacent text to the right (first example) or to the left (second example). \ii plain~TeX Plain \TeX/ declares macros for these cases: \i rlap \z`\rlap{<text>}` or \i llap \z`\llap{<text>}`. The last line of each paragraph is finalized by a glue of type \i hfil `\hfil` by default. When you write \i hfill `\hfill <object>` in vertical mode (`<object>` is something like a table, image or whatever else in the box) then `<object>` is flushed right, because the paragraph is started by the `\hfill` space but finalized only by \i hfil `\hfil` space. If you type \i noindent `\noindent\hfil <object>` then the <object> is centered. And putting only `<object>` places it to the left side because the common left side is the default placement rule in vertical mode. The same principles that apply to horizontal glues are also applicable to vertical modes where glues are created by \i vskip `\vskip` commands instead of `\hskip` commands. You can write \i vbox `\vbox to<size>{...}` and do experiments. When the paragraph breaking algorithm decides about the suitable breakpoints for creating lines with the desired width \i hsize `\hsize`, then each glue is a potentially breakable point. Each glue can be preceded by a \ii penalty {\em penalty} value (created by the \i penalty `\penalty` primitive) in the typical range $-10000$ to $10000$. The paragraph breaking algorithm gets a penalty if it decides to break line at the glue preceded by the given penalty value. 
If no penalty is declared for a given glue, then it is the same as a penalty equal to zero.\fnote {More precisely: the paragraph breaking algorithm or page breaking algorithm can break a horizontal list into lines (or a vertical list into pages) at penalties (then it gets the given penalty) or at glues (then the penalty is zero). The second case is possible only if no penalty nor glue precedes. The item where the list is broken (penalty or glue) is discarded and all immediately following glues, penalties and kerns are discarded too. They are called \ii discardable/item {\em discardable items}}. The penalty value $10000$ or more means \"impossible to break". A negative penalty means a bonus for the paragraph breaking algorithm. The penalty $-10000$ or less means \"you must break here". The paragraph breaking algorithm tries to find an optimum of breakpoint positions with respect to all penalties, all badnesses of all created lines and many more values not mentioned here in this brief document. The analogous optimal breakpoint is found in vertical material when \TeX/ breaks it into pages. The concept \"box, penalty, glue" with the optimum-fit breaking algorithms makes \TeX/ unique among typesetting software.

\sec Syntactic rules

A primitive command can get its parameters written after it. These parameters must suit syntactic rules given for each primitive command. Some parameters are optional. For example \i hskip `\hskip<dimen> /plus<stretchability>. /minus<shrinkability>.` means that the parameter `<dimen>` must follow (it must suit syntactic rules for dimensions, see section \ref[reg]) then the optional parameter prefixed by keyword \ii plus `plus` can follow and then the optional parameter prefixed by \ii minus `minus` can follow. We denote the optional parameters by underline in this document. {\emergencystretch=2em\par} \ii keyword {\em Keywords} (typically prefixes to some parameters) may have optional spaces around them.
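
For illustration, the following four lines should be read by \TeX/ as the same glue specification. This is only a sketch with arbitrary values; it shows that the spaces around the keywords `plus` and `minus` can be present or omitted.

\begtt
\hskip 10pt plus 5pt minus 2pt
\hskip10pt plus5pt minus2pt
\hskip 10pt   plus   5pt   minus   2pt
\hskip10ptplus5ptminus2pt
\endtt
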
The explicit expressions of numbers (i.e.\ `75`, `"4B`, \code{`K}; see section~\ref[reg]) should be terminated by one optional space which is not printed. This space can serve as a termination character which says that \"the whole number is presented here; no more digits are expected". If the syntactic rule mentions the pair `{`, `}` then these characters are not definitive: other characters may be tokenized with this special meaning but it is not common. The text between this pair must be \ii balanced/text {\em balanced} with respect to this pair. For example the syntactic rule \i message \z`\message{<text>}` supposes that \z`<text>` must not be `ab{cd`, but `ab{c{}}d` is allowed for instance. By default, all parameters read by primitive commands are taken from the input stream, tokenized and fully expanded by the expand processor. But sometimes, when \TeX/ reads parameters for a primitive command, the expand processor is deactivated. We denote these parameters by red color. For example, \i let `\let|<control sequence>=<token>;` means that these parameters processed by the `\let` command are not expanded. Whenever a syntactic rule mentions the \ii equal/sign `=` character (see the previous example with the \i let `\let` command), then this is the equal sign tokenized as a normal character and it is optional. The syntactic rule allows it to be omitted. Optional spaces are allowed around this equal sign. The concept of the optional parameters of primitive commands (terminated if something different from the keyword follows) may bring trouble if a macro programmer forgets to terminate an incomplete parameter text by the \x`\relax` command (`\relax` does nothing but it can terminate a list of optional parameters of the previous command). Suppose, for example, that `\mycoolspace` is defined by `\def\mycoolspace{\penalty42\hskip2mm}`.
If a user writes `first\mycoolspace plus second` then \TeX/ reports the error `missing` `number,` `treated` `as` `zero` in the position of `s` character and appends: \code{<to be read again> s}. A user who is unfamiliar with \TeX/ primitive commands and their parameters is totally lost. The correct definition looks like: `\def\mycoolspace{\penalty42\hskip2mm\relax}`. \sec[def] Principles of macros Macros can be declared by the \i def `\def` primitive command (or `\edef`, `\gdef`, `\xdef` commands; see below). The syntax is `\def|<control sequence>/<parameters>.{<replacement text>};`. The `<parameters>` are a sequence of formal parameters of the declared macro written in the form `#1`, `#2`, etc. They must be numbered from one and incremented by one. The maximum number of declared parameters is nine. These parameters can be used in the `<replacement text>`. This specifies the place where the real parameter is positioned when the macro is expanded. For example: \begtt \def\test #1{here is "#1".} \test A % expands to: here is "A". \def\swap #1#2{#2#1} \swap AB % expands to: BA \test {param} % expands to: here is "param". \swap A{param} % expands to: paramA \endtt % Note that there are two possibilities of how to write real macro parameters when a macro is in use. The parameter is one token by default but if there is `{<something>}` then the parameter is `<something>`. The braces here are delimiters for the real parameter (no \TeX/ group is opened/closed here). The example above shows a declaration of \ii unseparated/parameter,@ {\em unseparated parameters}. The parameters were declared by `#1` or `#1#2` with no text appended to such a declaration. But there is another possibility. Each formal parameter can have a text appended in its declaration, so the general syntax of the declaration of formal parameters is \z`#1/<text1>.#2/<text2>.` etc. 
If such \z`<text>` is appended then we say that the parameter is \ii separated/parameter,@ {\em separated} or \ii delimited/parameter,@ {\em delimited} by text. The same delimiter must be used when the macro is in use. For example \begtt \def\Test #1#2..#3 {first "#1", second "#2", third "#3".} \Test ABC..DEF G % expands to: first "A", second "BC", third "DEF". % the letter G follows after expansion. \endtt % In the example above the `#1` parameter is unseparated (one token is read as a real parameter if the syntax \z`{<parameter>}` is not used). The `#2` parameter is delimited by two dots and the `#3` parameter is delimited by space. There may be a \z`<text0>` immediately before `#1` in the parameter declaration. This means that the declared macro must be used with the same \z`<text0>` immediately appended. If not, \TeX/ reports the error. The general rule for declaration of a macro with three parameters should be: \i def \z`\def|<control sequence>/<text0>.#1/<text1>.#2/<text2>.#3/<text3>.{<replacement text>};`. The rule \"everything must be balanced" is applied to separated parameters too. It means that `\Test AB{C..DEF G}.. H` from the example above reads `B{C..DEF G}` to the `#2` parameter and the `#3` parameter is empty because the space (the delimiter of `#3` parameter) immediately follows two dots. If the real parameter is in the form `{...}` then the outer braces are removed from the parameter. For example `\Test A{C..DEF G}.. H` reads `C..DEF G` to the `#2`. When reading an unseparated parameter, \TeX/ ignores spaces before first non-space token. Suppose for example `\def\m:#1{"#1"}`. Then both `\m:x` and `\m: x` print `"x"` and both `\m:{a b}` and `\m: {a b}` print `"a b"`. The separated parameter can bring a potential problem if the user forgets the delimiter or the delimiter is specified incorrectly. Then \TeX/ reports an error. 
This error is reported when the first \i par `\par` is scanned as part of the parameter (probably generated from an empty line). If you really want to scan as part of the parameter more paragraphs including `\par` between them, then you can use the \i long `\long` prefix before `\def`. For example `\long\def\scan#1\stop{...}` reads the parameter of the `\scan` macro up to the `\stop` control sequence, and this parameter can include more paragraphs. If the delimiter is missing when a \i long `\long` defined macro is processed, then \TeX/ reports an error at the end of the file. When a real parameter of a macro is scanned then the expand processor is deactivated. When the `<replacement text>` is processed then the expand processor works normally. This means that if parameters are used in the `<replacement text>`, then they are expanded here. If a macro declaration is used inside the `<replacement text>` of another macro then the number of `#` must be doubled for the inner declaration. Example:
\begtt
\def\defmacro#1#2{%
   \def#1##1 ##2 {##1 says: #2 ##2.}%
}
\defmacro \hello {hello}  % expands to \def\hello#1 #2 {#1 says: hello #2.}
\defmacro \goodbye {good bye}
\hello Jane Eric     % expands to: Jane says: hello Eric.
\goodbye Eric John   % expands to: Eric says: good bye John.
\endtt
The exact implementation of the feature above: when \TeX/ reads a macro body (during `\def`, `\edef`, `\gdef`, `\xdef`) then each double `#`\c6 is converted to a single `#`\c6 and each (not yet converted) single `#`\c6 followed by a digit is converted to an internal mark of a future parameter. This mark is replaced by the real parameter when the defined macro is used. This rule of conversion of the macro body has one exception: `\edef{...\the\toks...}` keeps the toks content unexpanded and without conversion of hashes. And there exists a reverse conversion from internal marks to~`#`\c{12}<number> and from `#`\c6 to `#`\c{12}`#`\c{12} when \TeX/ writes the macro body by the \i meaning `\meaning` primitive.
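
The reverse conversion can be observed directly. The following line is only a sketch; it assumes that the `\defmacro` definition from the example above has already been processed, and the exact form of the reported text may differ slightly.

\begtt
\message{\meaning\defmacro}
% should report: macro:#1#2->\def #1##1 ##2 {##1 says: #2 ##2.}
\endtt

The internal parameter marks are shown here as `#1` and `#2`, while the hashes which belong to the inner declaration are shown doubled.
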
Note the `%` characters used in the `\defmacro` definition in the example above. They mask the end of lines. If you don't use them, then the space tokens are included here (generated by the tokenizer at the end of each line). The `<replacement text>` of `\defmacro` will be `<space>\def#1...{...}<space>` in such a case. Each usage of `\defmacro` generates two unwanted spaces. It is not a problem if `\defmacro` is used in the vertical mode because spaces are ignored in this mode. But if `\defmacro` is used in horizontal mode then these spaces are printed.\fnote {More precisely, they are transformed into horizontal glues used between words.} The macro declaration behaves as another assignment, so the information about such a declaration is lost if it is used in a group and the group is left. But you can use a \i global `\global` prefix before \i def `\def` or the primitive \i gdef `\gdef`. Then the assignment is global regardless of groups. When `\def` or `\gdef` is processed then `<replacement text>` is read with the deactivated expand processor. We have alternatives \i edef `\edef` (expanded def) and \i xdef `\xdef` (global expanded def) which read their `<replacement text>` expanded by the expand processor. The summary of `\def` syntax is:
\begtt \catcode`\|=13 \catcode`\/=13 \catcode`\<=13 \Blue
\def|<control sequence>/<parameters>.{<replacement text>};  % local assignment
\gdef|<control sequence>/<parameters>.{<replacement text>}; % global assignment
\edef|<control sequence>/<parameters>.{;<replacement text>} % local assignment
\xdef|<control sequence>/<parameters>.{;<replacement text>} % global assignment
\endtt
If you set \i tracingmacros `\tracingmacros=2`, you can see in the log file how the macros are expanded.

\sec Math modes

The `$`\c3`<math text>$`\c3 specifies a math formula inside a line of the paragraph. It processes the `<math text>` in a group and in \ii internal/math/mode,math/mode/internal {\em internal math mode}.
The `$`\c3`$`\c3`<math text>$`\c3`$`\c3 generates a separate line with math formula(s). It processes the `<math text>` in a group and in \ii display/math/mode,math/mode/display {\em display math mode}. The fonts in math mode are selected in a very specific manner which is independent of the current text font. Seven different math objects are automatically detected in math mode: \x`\mathord` (normal material), \x`\mathop` (big operators), \x`\mathbin` (binary operators), \x`\mathrel` (relations), \x`\mathopen` (open brackets), \x`\mathclose` (close brackets), \x`\mathpunct` (punctuation). They can be processed in four styles \x`\displaystyle` (default in the display mode), \x`\textstyle` (default in the internal math mode), \x`\scriptstyle` (used for indexes or exponents, smaller text) and \x`\scriptscriptstyle` (used in indexes of indexes, even smaller text). The math typesetting algorithms were implemented in \TeX/ by its author with great care. All typographical traditions of math typesetting were taken into account. There are three chapters about math typesetting in his \TeX/book. Moreover, there is the detailed appendix G containing the exact specification of generating math formulae. This topic is unfortunately out of the scope of this short text. More about it can be found in \ulink[http://petr.olsak.net/ftp/olsak/optex/optex-math.pdf]{Typesetting Math with \OpTeX/}. There is a piece of good news: all formats (including \LaTeX/) take the default \TeX/ syntax for `<math text>`. So, \LaTeX/ manuals or \LaTeX/ documents serve as a good source if you want to get to know the rules of math typesetting by \TeX. There is only one significant difference. Fractions are constructed at the primitive level by the \x`\over` primitive: `{<numerator>\over<denominator>}` but \LaTeX/ uses a macro {\doda\x`\frac`} in the syntax `\frac{<numerator>}{<denominator>}`.
Plain \TeX/ users (including the author of \TeX/) prefer the syntax which follows the principle \"how a human reads the formula". On the other hand, the \x`\frac` syntax is derived from machine languages. You can define the \x`\frac` macro by `\def\frac#1#2{{#1\over#2}}` if you want. \sec[reg] Registers \ii register There are four types of registers used in \TeX: \begitems \def\!{\kern-1pt} * \ii counter/type/register {\em Counters}; their values are integer numbers. Counters are declared by \i newcount `\newcount|<register>;`\fnote {The declarators \x`\newcount`, \x`\newdimen`, \x`\newskip` and \x`\newtoks` are plain \TeX/ macros used in all known \TeX/ formats. They provide `<address>` allocation and use the `\count<address>`\!, `\dimen<address>`\!, `\skip<address>` and `\toks<address>` \TeX/ registers. The \x`\countdef`, \x`\dimendef`, \x`\skipdef` and \x`\toksdef` primitive commands are used internally.} or they are primitive registers (\x`\linepenalty` for example). \TeX/ interprets primitive commands which represent an integer from an internal table as counter type register too (examples: \x`\catcode`\code{`A}, \x`\lccode`\code{`A}). * \ii dimen/type/register {\em Dimen type}; their values are dimensions. They are declared by \i newdimen `\newdimen|<register>;` or they are primitive registers (\x`\hsize`, for example). \TeX/ interprets primitive commands which represent a dimension value as dimen type register too (example: \i wd `\wd0`). * \ii glue/type/register {\em Glue type}; their values are triples like in general \x`\hskip` parameters. They can be declared by \i newskip `\newskip|<register>;` or they are primitive registers (\x`\abovedisplayskip` for example).\fnote {Very similar {muglue type} for math glues exists too but it is not described in this text.} * \ii token/type/register {\em Token lists}; their values are sequences of tokens. They are declared by \i newtoks `\newtoks|<register>;` or they are primitive registers (\x`\everypar` for example). 
\enditems The following example shows how registers are declared, how a value is saved to the register, and how to print the value of the register. \tmpnum=42 \tmpdim=-13cm \skip0=10mm plus 12mm minus1fil \toks0={abCd ef} \begtt \newcount \mynumber \newdimen \mydimen \newskip \myskip \newtoks \mytoks \mynumber = 42 \mydimen = -13cm \myskip = 10mm plus 12mm minus1fil \mytoks = {abCd ef} To print these values use the primitive command "the": \the\mynumber, \the\mydimen, \the\myskip, \the\mytoks. \bye \endtt % This example prints: To print these values use the primitive command "the": \the\tmpnum, \the\tmpdim, \the\skip0, \the\toks0. Note that the human readable dimensions are converted to typographical points~(pt). The general syntactic rule for storing values to registers is `<register>=<value>` where the equal sign is optional and it can be surrounded by optional spaces. Syntactic rules for each type of `<value>` depending on type of the register (i.e.\ `<number>`, `<dimen>`, `<skip>` and `<toks>`) follows. \begitems \let\_aboveliskip=\relax * The `<number>` could be \begitems \style N * a register of counter type; * a character constant declared by \x`\chardef` or \x`\mathchardef` primitive command. * an integer decimal number (with optional `+` or `-` prefixed) * {\let\,=\relax `"<hexa number>`} where `<hexa number>` can include digits `0123456789ABCDEF`\,; * {\let\,=\relax`'<octal number>`} where `<octal number>` can include digits `01234567`\,; * {\let\,=\relax \code{`}`<character>`} (the prefix is the reverse single quote \code{`}). It returns the code of the `<character>`. Examples: \code{`}{`|A;`} or one-character control sequence \code{`}{`|\A;`}). Both examples represent the number 65. The Unicode of the character is taken here if Lua\TeX/ or \XeTeX/ is used; * \i numexpr `\numexpr<num. expression>`.\fnote {This is a feature of the $\varepsilon$\TeX/ extension. It is implemented in pdf\TeX, \XeTeX/ and Lua\TeX.} The `<num. 
expression>` uses operators `+`, `-`, `*` and \code{/} and brackets `(`, `)` in normal sense. The operands are `<number>`s. It is terminated by something incompatible with the syntactic rule of `<num. expression>` or by `\relax`. The `\relax` (if it is used as a separator) is removed. If the result is non-integer, then it is rounded (not truncated). \enditems The rules 3)--6) can be terminated by one optional space. * The `<dimen>` could be \begitems \style N * a register of dimen type or counter type; * a decimal number with an optional decimal point (and optional `+` or `-` prefixed) followed by `<dimen unit>`. The `<dimen unit>` is \ii pt `pt` (point)\fnote {1\,pt = 1/72.27\,in $\doteq$ 0.35\,mm\,;\ 1\,pc = 12\,pt\,;\ 1\,bp = 1/72\,in\,;\ 1\,dd $\doteq$ 1.07\,pt\,;\ 1\,cc = 12\,dd\,;\ 1\,sp = $2^{-16}$\,pt = \TeX/ accuracy.} or \ii mm `mm` or \ii cm `cm` or \ii in `in` or \ii bp `bp` (big point) or \ii dd `dd` (Didot point) or \ii pc `pc` (pica) or \ii cc `cc` (cicero) or \ii sp `sp` (scaled point) or \ii em `em` (quad of current font) or \ii ex `ex` (ex~height of current font) or a register of dimen type; * \i dimexpr `\dimexpr<dimen expression>`. The `<dimen expression>` uses operators `+`, `-`, `*` and \code{/} and brackets `(`, `)` in their normal sense. The operands of `+` and `-` are `<dimen>`s, the operators of `*` or \code{/} are the pair `<dimen>` and `<number>` (in this order). The `<dimen expression>` is terminated by something incompatible with the syntactic rule of `<dimen expression>` or by `\relax`. The `\relax` (if it is used as a separator) is removed. \enditems The rule 2) can be terminated by one optional space. * The `<skip>` could be: \begitems \style - * a register of glue type or dimen type or counter type; * `<dimen>/plus<generalized dimen>. /minus<generalized dimen>.`. The `<generalized dimen>` is the same as `<dimen>`, but normal `<dimen unit>` or pseudo-unit `fil` or `fill` or `filll` can be used. 
\enditems * The `<toks>` could be \begitems \style - * `/<expandafters>.`\z`{|<text>};`. The `<expandafters>` is typically a sequence of `\expandafter` primitive commands (zero or more). The \z`<text>` is scanned without expansion but the exception can be given by `<expandafters>`. \enditems \enditems \removelastskip The main processor reads input tokens (from the output of activated or deactivated expand processor) in two contexts: \ii do/something/context,context/do/something {\em do something} or {\em read parameters}. By default it is in the context {\em do something}. When a primitive which allows parameters is read, the main processor reads the parameters in the context \ii read/parameters/context,context/read/parameters {\em read parameters}. Whenever the main processor reads a register in the context {\em do something} it assumes that an assignment of a value to the register is declared here. The following text (equal sign and `<value>`) is read in the context {\em read parameters}. If the following text doesn't comply with the appropriate syntactic rule, \TeX/ reports an error. Examples of register manipulations:
\begtt
\newcount\mynumber \newdimen\mydimen \newskip\myskip
\hsize = .7\hsize     % see the rule for <dimen>, unit could be a register
\hoffset = \dimexpr 10mm - (\parindent + 1in) \relax % usage of \dimexpr
\myskip = 10pt plus15pt minus 3pt
\mydimen = \myskip    % the information "plus15pt minus 3pt" is lost
\mynumber = \mydimen  % \mynumber = 10*2^16 because \mydimen = 10*2^16 sp
\endtt
%
Each dimension is saved internally as an integer multiple of the `sp` unit in \TeX. When we need a conversion `<dimen>` $\to$ `<number>`, then simply the internal unit `sp` is omitted. The summary of most commonly used primitive registers including their default value given by plain \TeX/ follows. \let\makedest=\makedestactive \begitems \rightskip=0pt plus1fil * \y`\hsize=6.5in`, \y`\vsize=8.9in` are paragraph width and page height.
* \y`\hoffset=0pt`, \y`\voffset=0pt` give left margin and top margin of the page. They are calculated from the \ii page/origin {\em page origin} which is defined by coordinates \y`\pdfvorigin=1in` and \y`\pdfhorigin=1in` measured from the upper left corner of the page. * \y`\parindent=20pt` is the indentation of the first line of each paragraph. * \y`\parfillskip=0pt plus 1fil` is horizontal glue added to the last line of the paragraph. * \y`\leftskip=0pt`, \y`\rightskip=0pt`. Glues added to each line in the paragraph from the left and the right side. If the stretchability is declared here, then the paragraph is ragged left/right. * \y`\parskip=0pt plus 1pt` is the vertical space between paragraphs. * \y`\baselineskip=12pt`, \y`\lineskiplimit=0pt`, \y`\lineskip=1pt`. \ii baselineskiprule The {\em `\baselineskip` rule} says: Two consecutive lines in the vertical list have the baseline distance given by \x`\baselineskip` by default. The appropriate real glue is inserted between the lines. But if this real glue (between boxes) is less than \x`\lineskiplimit` then \x`\lineskip` is inserted between the boxes instead. * \y`\topskip=10pt` is the distance between the top of the page box and the baseline of the first line. \y`\splittopskip=10pt` is the same for a box remaining after \q`\vsplit`. * \y`\linepenalty=10`, \y`\hyphenpenalty=50`, \y`\exhyphenpenalty=50`, \y`\binoppenalty=700`, \y`\relpenalty=500`, \y`\clubpenalty=150`, \y`\widowpenalty=150`, \y`\displaywidowpenalty=50`, \y`\brokenpenalty=100`, \y`\predisplaypenalty=10000`, \y`\postdisplaypenalty=0`, \y`\interlinepenalty=0`, \y`\floatingpenalty=0`, \y`\outputpenalty=0`. These penalties apply to various places in the vertical or horizontal list. Most important are \x`\clubpenalty` (inserted below the first line of a paragraph) and \x`\widowpenalty` (inserted before the last line of a paragraph). Typographical rules often require us to set these registers to 10000 (no page break is allowed here).
* \y`\looseness=0` allows us to create a \"suboptimal" paragraph. The paragraph building algorithm tries to build the paragraph with \x`\looseness` lines more than the optimal solution. If the {\noda\x`\tolerance`} does not have a sufficiently large value then this setting is simply ignored. It is reset to zero after each paragraph is completed. * \y`\spaceskip=0pt`, \y`\xspaceskip=0pt`. If non-zero they are used as glues between words. Default values are read from the font metric data of the current font. * \y`\pretolerance=100`, \y`\tolerance=200`, \y`\emergencystretch=0pt` \y`\doublehyphendemerits=10000`, \y`\finalhyphendemerits=5000`, \y`\adjdemerits=10000`, \y`\hfuzz=0.1pt`, \y`\vfuzz=0.1pt` are parameters for the paragraph building algorithm (not described here in detail). * \y`\uchyph=1`, if it is positive, then words with capital first letter can be hyphenated. * \y`\hbadness=1000`, \y`\vbadness=1000`. \TeX/ reports a warning about \iid badness on the terminal and to the log file if it is greater than these values. The warning has the form \ii underfull/box `underfull` `\hbox` or `underfull \vbox`. The value `100` means that the `plus` limit for glues is reached. * \y`\tracingonline=0`, \y`\tracingmacros=0`, \y`\tracingstats=0`, \y`\tracingparagraphs=0`, \y`\tracingpages=0`, \y`\tracingoutput=0`, \y`\tracinglostchars=1`, \y`\tracingcommands=0`, \y`\tracingrestores=0`, \y`\tracingscantokens=0`, \y`\tracingifs=0`, \y`\tracinggroups=0`, \y`\tracingassigns=0`. If these registers have positive values then \TeX/ reports details about the processing of built-in algorithms to the log file. If \i tracingonline `\tracingonline>0` then the same output is shown on the terminal. * \y`\showboxbreadth=5`, \y`\showboxdepth=3`, \y`\errorcontextlines=5`. The amount of information shown when boxes are traced to the log file or an error is reported. * \y`\language=0`. \TeX/ is able to load more hyphenation patterns for more languages.
This register points to the index of currently used hyphenation patterns. Zero means English. * \y`\lefthyphenmin=2`, \y`\righthyphenmin=3`. Minimum letters left or right in hyphenated words. * \y`\defaulthyphenchar=`\code{`}`\-`. This character is used when words are hyphenated. * \y`\globaldefs=0`. If it is positive then all settings are global. * \y`\hangafter=1`, \y`\hangindent=0pt`. If \x`\hangindent` is positive, then after \x`\hangafter` lines all following lines are indented. Negative/positive values of \x`\hangindent` or \x`\hangafter` apply indentation from left or right and from the top or bottom of the paragraph. The \x`\hangindent` is set to 0 after each paragraph. * \y`\mag=1000`. Magnification factor of all used dimensions. The value 1000 means 1:1. * \y`\escapechar=`\code{`}`\\`. This character is used by the `\string` primitive. * \y`\newlinechar=-1`. If positive, this character is interpreted as the end of the line when printing to the log or by the {\noda\x`\write`} primitive command. * \y`\endlinechar=`\code{`}`^^M`. This character is appended to the end of each input line. The tokenizer converts it (the Ctrl-M character) to the space token. * \y`\time=now`, \y`\day=now`, \y`\month=now`, \y`\year=now`. The values about current time/date are set here when \TeX/ starts to process the document. The \x`\time` counts minutes after midnight. * \y`\prevdepth=*` includes the depth of the last box in vertical mode. * \y`\prevgraph=*` includes the number of lines of the paragraph when `\par` finishes. * \y`\overfullrule=5pt`. A rectangle of this width is appended after each \ii overfull/box overfull `\hbox`. * \y`\mathsurround=0pt` is the space inserted around a formula in internal math mode. * \y`\displaywidth=*` includes the width of the line with display formula. * \y`\abovedisplayskip=12pt plus3pt minus9pt`, \y`\abovedisplayshortskip=0pt plus3pt`, \y`\belowdisplayskip=12pt plus3pt minus9pt`, \y`\belowdisplayshortskip=7pt plus3pt minus 4pt`.
These spaces are inserted above and below a formula generated in math display mode. * \y`\thinmuskip=3mu`, \y`\medmuskip=4mu plus 2mu minus 4mu`, \y`\thickmuskip=5mu plus 5mu`. These spaces are inserted after comma, around binary operators, and around relations in math mode. The special math unit {\tt 1mu} is $(1/18)${\tt em}. * \y`\tabskip=0pt` is used by the `\halign` primitive command for creating tables. * \y`\output={\plainoutput}`, \y`\everypar={}`, \y`\everymath={}` \y`\everydisplay={}`, \y`\everyhbox={}` \y`\everyvbox={}` \y`\everycr={}`, \*\y`\everyeof={}`, \y`\everyjob={}`. These token lists are processed when an algorithm of \TeX/ reaches the corresponding situation, respectively: opens output routine, paragraph, internal math mode, display math mode, {\noda\x`\hbox`, \x`\vbox`}, is at the end of a line in a table, at the end of an input file, or starts the job. * \y`\maxdepth=4pt`, \y`\boxmaxdepth=\maxdimen`, \y`\splitmaxdepth=\maxdimen` are maximal pagebox / ordinary `\vbox` / `\vsplit`ted box depth. If it is exceeded, the baseline is shifted down. * \y`\delimiterfactor=901`, \y`\delimitershortfall=5pt` are parameters for calculating size of math delimiters. \y`\nulldelimiterspace=1.2pt` is empty delimiter horizontal space. * \y`\defaultskewchar=-1` sets a character \x`\skewchar` in font used for positioning math accents. * \x`\fontdimen``<number><font selector>` enables access to various values of given font. * \y`\pdfpagewidth=210mm`, \y`\pdfpageheight=297mm` are PDF page dimensions (implemented in pdf\TeX/ and its successors). \enditems

\sec[expand] Expandable primitive commands

These commands are processed like macros, i.e.\ they expand to another sequence of tokens. Notes about notation are in this and the following sections. If the documented command is from the $\varepsilon$\TeX{} extension (i.e.\ implemented in pdf\TeX, \XeTeX/ and Lua\TeX) then one * is prefixed.
If it is from the pdf\TeX/ extension (implemented in \XeTeX/ and Lua\TeX/ too) then two ** are prefixed. If it is~a~Lua\TeX/ only command then three *** are prefixed. \begitems * \i string `\string|<control sequence>;` expands to \"the \x`\escapechar`" followed by the name of the control sequence. \i escapechar \"The \x`\escapechar`" means a character with code equal to \x`\escapechar` or nothing if its value is out of range of character codes. All characters of the output are \"other characters\c{12}", only spaces (if any exist) are kept as space tokens {\char9251}\c{10}. * \*\*\*\i csstring `\csstring|<control sequence>;` works like `\string` but without `\escapechar`. * \*\i detokenize `\detokenize/<expandafters>.`\z`{|<text>};` re-tokenizes all tokens in the text. Control sequences used in \z`<text>` are re-tokenized like the `\string` primitive, spaces are tokens {\char9251}\c{10}, and all other tokens are set as \"other characters\c{12}". * \i the `\the<register>` expands to the value of the register. Examples appear in the previous section. The output is tokenized like the output of `\detokenize`. The exception is `\the<tokens register>`: the output is the value of the `<tokens register>` without re-tokenizing and the expand processor does not expand this output in {\noda`\edef`, `\write`, `\message`}, etc., arguments. * \*\i scantokens `\scantokens/<expandafters>.`\z`{|<text>};` re-tokenizes \z`<text>` using the actual tokenizer setting. The behavior is the same as when writing \z`<text>` to a virtual file and reading this file immediately. * \*\*\*\i scantextokens `\scantextokens/<expandafters>.`\z`{|<text>};` is the same as `\scantokens` but removes problems with end-of-virtual-file. * \i meaning `\meaning|<token>;` expands to the meaning of the `<token>`. The text is tokenized like the `\detokenize` output. * \i csname \i endcsname \z`\csname<text>\endcsname` creates a control sequence with name \z`<text>`.
If it is not already defined, then it gets the `\relax` meaning. For example `\csname TeX\endcsname` is the same as `\TeX`. The \z`<text>` must be expandable to characters only. Non-expandable control sequences (a primitive command at the main processor level, a register, a character constant, a font selector) are disallowed here. \TeX/ reports the error `missing \endcsname` when this rule is violated. Example: `\csname foo:\the\mynumber\endcsname` expands to the control sequence `\foo:42` if the `\mynumber` is a register with the value 42. Another example: a macro programmer can implement a key/value dictionary using this primitive:
\begtt
\def\keyval #1 #2 {\expandafter\def\csname dict:#1\endcsname{#2}}
\def\value #1 {\csname dict:#1\endcsname}
\keyval Peter 21    % key=Peter, value=21, saved to the dictionary
                    % it does \def\dict:Peter{21}
\value Peter        % expands to \dict:Peter which expands to 21.
\endtt
* \*\*\*\i lastnamedcs `\lastnamedcs` is the last control sequence created by `\csname ...\endcsname`.
* \i expandafter \z`\expandafter|<token 1><token 2>;` does the transformation \z`<token 1><expanded token 2>`. Then \TeX/ processes `<token 1>` followed by `<expanded token 2>`. If `<token 2>` isn't expandable then `\expandafter` silently does nothing. The \z`<expanded token2>` is only the first level of expansion. For example, a macro is transformed to its `<replacement text>` but without expansion of `<replacement text>` at this time. Or the `\csname...\endcsname` pair creates a control sequence but does not expand it at this time.
% If \z`<token 2>` is not expandable then \x`\expandafter` silently does
% nothing.
A typical usage: the `<token 1>` is a macro or a \TeX/ primitive which needs `<expanded token 2>` as its parameter. The example above (the `\keyval` macro) shows this case. We do not want to define `\csname` by `\def`; we want to define `\dict:key`. The \x`\expandafter` helps here. The \z`<token 2>` can be another \x`\expandafter`.
We can see \x`\expandafter` chains in many macro files. For example `\expandafter\A\expandafter\B\expandafter\C\D` is processed as \z`\A \B \C <expanded>\D`. The `/<expandafters>.`\z`{|<text>};` syntax rule enables us to prepare \z`|<text>;` by `\expandafter`(s). For example \i detokenize `\detokenize{\macro}` expands to `\`\c{12}`m`\c{12}`a`\c{12}`c`\c{12}`r`\c{12}`o`\c{12}{\Blue\char9251}\c{10}. If you need to detokenize the `<replacement text>` of the `\macro` then use `\detokenize\expandafter{\macro}`. Not only `\expandafter`s can be used here. The expand processor does full expansion here until an opening brace `{`\c{1} is found.
* \i if \i else \i fi The general rule for all `\if*` commands is `<if condition><true text>/\else<false text>.\fi`. The `<if condition>` is evaluated and `<true text>` or `<false text>` is skipped or processed depending on the result of `<if condition>`. When the expand processor is skipping the text due to an `\if*` command, it expands nothing in the skipped text. But it notices all control sequences with meaning `\if*`, `\else` and `\fi` during skipping in order to skip correctly all nested `\if*.../\else....\fi` constructions. The following `<if condition>`s are possible:
\_printitem={$\circ$\enspace}
* \i if \z`\if<token 1><token 2>` is true if
\begitems \removelastskip \style a
* both tokens are the same characters or
* both tokens are control sequences (with arbitrary meaning but not \"the character") or
* one token is a character, the second is a control sequence equal to this character (by `\let`) or
* both tokens are control sequences, their meaning (set by `\let`) is the same character.
\enditems
In a), c) and d), only character codes are compared, not their category codes.\nl Example: you can say `\let\test=a` then `\if\test a` returns true.
* \i ifx \z`\ifx|<token 1><token 2>;` is true if the meanings of \z`<token 1>` and \z`<token 2>` are the same.
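The difference between `\if` and `\ifx` can be sketched as follows. This is only an illustration: the macro names `\first`, `\second` and `\third` are hypothetical and chosen for this example.
\begtt
\def\first{abc} \def\second{abc} \def\third{xyz}
\ifx\first\second same\else different\fi % same: equal replacement texts
\ifx\first\third  same\else different\fi % different
\let\test=a
\if\test a yes\else no\fi                % yes: \test means the character a
\endtt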
* \*\i ifcsname `\ifcsname<text>\endcsname` is true if the control sequence `<text>` is declared.
* \i ifnum `\ifnum<number 1><relation><number 2>`. The `<relation>` can be \code{<} or \code{=} or \code{>}. It returns true if the comparison of the two numbers is true.
* \i ifodd `\ifodd<number>` returns true if the `<number>` is odd.
* \i ifdim `\ifdim<dimen><relation><dimen>`. The `<relation>` can be \code{<} or \code{=} or \code{>}. It returns true if the comparison of the two dimensions is true.
* \x`\iftrue` always returns true, \x`\iffalse` always returns false.
* \x`\ifhmode`, \x`\ifvmode`, \x`\ifmmode` -- true if the current mode is horizontal, vertical, math.
* \x`\ifinner` returns true if the current mode is internal vertical, internal horizontal or internal math mode.
* \i ifhbox `\ifhbox<box number>`, \i ifvbox `\ifvbox<box number>`, \i ifvoid `\ifvoid<box number>` return true if the specified `<box number>` represents {\noda\x`\hbox`, \x`\vbox`}, void box respectively.
* \i ifcat \z`\ifcat<token 1><token 2>` is true if the category codes of \z`<token 1>` and \z`<token 2>` are equal.
* \i ifeof `\ifeof<file number>` is true if the file attached to the `<file number>` by the {\noda\x`\openin`} primitive does not exist, or the end of file was reached by the {\noda\x`\read`} primitive.
\_printitem{$\bullet$\enspace}
* \*\i unless `\unless<if condition>` negates the result of `<if condition>` before skipping or processing the following text.
* \i ifcase `\ifcase<number><case 0>\or<case 1>\or<case 2>` `...` `\or<case n>/\else<else text>.\fi`. This processes the branch given by `<number>`. It processes `<else text>` (or nothing if no `<else text>` is declared) when a branch with a given `<number>` does not exist.
* \*\i pdfstrcmp `\pdfstrcmp{<stringA>}{<stringB>}` returns $-1$ if <stringA>$\string<$<stringB>, 0 if they are equal or 1 in other cases. It is not implemented in \LuaTeX.
* \i noexpand `\noexpand|<token>;`.
The expand processor does not expand the `<token>` if it is expanding the text in {\noda\x`\edef`, \x`\write`, \x`\message`} or similar lists. * \*\i unexpanded `\unexpanded/<expandafters>.`\z`{|<text>};` returns \z`<text>` and applies `\noexpand` to all tokens in the \z`<text>`. * \*\*\i expanded `\expanded{<tokens>}` expands `<tokens>` and reads these expanded `<tokens>` again. * \*\i numexpr `\numexpr<num. expression>`, \i dimexpr \*`\dimexpr<dimen expression>`. Documented in the `<dimen>` and `<number>` syntax rules in section~\ref[reg]. * \i number `\number<number>`, \i romannumeral `\romannumeral<number>` prints <number> in decimal digits or as a roman numeral (with lowercase letters). * \x`\topmark` (last from previous page), \x`\firstmark` (first on current page), \x`\botmark` (last on current page). They expand to the corresponding {\noda\x`\mark`} included in the current or previous page-box. Usable for implementing running headers in the output routine. * \i fontname `\fontname<font selector>` expands to the file name \*\*\*(or font name) of the font given by its `<font selector>`. The `\fontname\font` expands to the file name of the current font. * \x`\jobname` expands to the name of the main file of this document (without extension `.tex`). * \i input `\input<file name><space>` (classical \TeX) or \ `\input"<file name>"` \ or \ `\input{<file name>}` \ opens the given `<file name>` and starts to read input from it. If the `<file name>` doesn't exist then \TeX/ tries again to open \,{\def\,{\kern-1pt}`<file name>.tex`}. If that doesn't exist, \TeX/ reports an error. The alternative syntax with `"..."` or `{...}` allows having spaces in the file names. * \i endinput `\endinput`. The current line is the last line of the file being input. The file is closed and reading continues from the place where `\input` of this file was started. `\endinput` done in the main file causes future reading from the terminal and a headache for the user. 
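The effect of `\noexpand` and `\unexpanded` described above can be sketched inside an `\edef`. The macro names used here are hypothetical and serve only as an illustration.
\begtt
\def\name{World}
\edef\greetA{Hello \name}               % \greetA is defined as: Hello World
\edef\greetB{Hello \noexpand\name}      % \greetB is defined as: Hello \name
\edef\greetC{\unexpanded{Hello \name}}  % \greetC is defined as: Hello \name
\endtt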
* \*\*\*\i Uchar \x`\Uchar``<number>` expands to a Unicode character with the given code <number> (\XeTeX/ too).
* \*\*\*\i directlua \z`\directlua {<text>}` runs a Lua script given in \z`<text>`.
* \*\*\*\i luaescapestring \z`\luaescapestring {<text>}` prepares `<text>` for usage as a Lua string (escapes `"` and `\`).
* \*\*\*\i immediateassignment `\immediateassignment`, \i immediateassigned \z`\immediateassigned {<code>}` make the following assignment (or the assignments in `<code>`) expandable.
\enditems

\sec[main] Primitive commands at the main processor level

{\bf Commands used for declaration of control sequences}
\par\nobreak\medskip\nobreak
\begitems
* \x`\def`, \x`\edef`, \x`\gdef`, \x`\xdef` were documented in section~\ref[def].
* \x`\long` is a prefix; it can be used before `\def`, `\edef`, `\gdef`, `\xdef`. The declared macro accepts the control sequence `\par` in its parameters.
* \*\x`\protected` is a prefix; it can be used before `\def`, `\edef`, `\gdef`, `\xdef`. The declared macro is not expanded by the expand processor in {\noda\x`\write`, \x`\message`, \x`\edef`}, etc., parameters.
* \x`\outer` is a prefix; it can be used before `\def`, `\edef`, `\gdef`, `\xdef`. The declared macro must be used only when the main processor is in the context {\em do something} or \TeX/ reports an error.
* \x`\global` is a prefix; it can be used before any assignment (commands from this subsection and `<register>=<value>` settings). The assignment is global regardless of the current group.
* \i chardef `\chardef|<control sequence>;=<number>`, \i mathchardef `\mathchardef|<control sequence>;=<number>` \ declare a constant <number>. When the main processor is in the context {\em do something} and it gets a \x`\chardef`-ed control sequence, it prints the character with the Unicode (ASCII) code `<number>` to the typesetting output. If it gets a \x`\mathchardef`-ed control sequence, it prints a math object (it works only in math mode).
* \i countdef `\countdef|<control sequence>;=<number>` declares `<control sequence>` as an equivalent to the `\count<number>` which is a register of counter type. The `<number>` here means an address in the array of registers of counter type. The `\count0` is reserved for the page number. Macro programmers rarely use direct addresses (1 to 9); it is more common to use the allocation macro `\newcount|<control sequence>;`.
* \x`\dimendef`, \x`\skipdef`, \x`\muskipdef`, \x`\toksdef` followed by `|<control sequence>;=<number>` analogously declare equivalents to \i dimen `\dimen<number>`, \i skip `\skip<number>`, \i muskip `\muskip<number>` and \i toks `\toks<number>`. Usage of the allocation macros {\noda\x`\newdimen`, \x`\newskip`, \x`\newmuskip`, \x`\newtoks`} is preferred.
* \i font `\font|<font selector>;=<file name><space>/<size specification>.` declares `<font selector>` of a font implemented in the `<file name>.tfm`. The `<size specification>` can be `at<dimen>` or `scaled<factor>`. The `<factor>` equal to {\tt 1000} means 1:1. New syntax (supported by Unicode engines) is
\begtt \catcode`\|=13 \catcode`\/=13 \catcode`\<=13 \Blue
\font|<font selector>;="<font name>/:<font features>." /<size specification>. % or
\font|<font selector>;="[<font file>]/:<font features>." /<size specification>.
\endtt
%
The `<font file>` is a file name without an `.otf` or `.ttf` extension. The `<font features>` are font features prefixed by `+` or `-` and separated by a semicolon. The {\let\,=\relax `otfinfo -f <file name>.otf`} command (on command line) can list them. Lua\TeX/ supports alternative syntax: `{...}` instead of `"..."`. Example: `\font\test={[texgyretermes-regular]:+onum;-liga} at12pt`. There is a default font selector \x`\nullfont` which selects an \"empty font".
* \i let `\let|<control sequence>=<token>;` gives the `<control sequence>` the same meaning as the `<token>` has. The `<token>` can be anything: a character or a control sequence.
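A short sketch of the fact that `\let` copies the current meaning and does not create a reference follows. The names `\old` and `\saved` are hypothetical and used only for this illustration.
\begtt
\def\old{first version}
\let\saved=\old          % \saved gets the current meaning of \old
\def\old{second version} % re-defining \old does not touch \saved
\saved                   % prints: first version
\old                     % prints: second version
\endtt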
* \*\*\* \x`\glet` is equal to `\global\let`.
* \i futurelet \z`\futurelet|<control sequence><token 1><token 2>;` works in two steps. In the first step it does \z`\let|<control sequence>=<token 2>;` and in the second step \z`<token 1><token 2>` is processed with activated token processor. Typically \z`<token 1>` is a macro that needs to know the next token.
\enditems
\goodbreak
\noindent {\bf Commands for box manipulation}
\begitems
* \i hbox `\hbox{<cmds>}` or `\hbox to<dimen>{<cmds>}` or `\hbox spread<dimen>{<cmds>}` creates a box. The material inside this box is a `<horizontal list>` generated by `<cmds>` in horizontal mode in a group. The width of the box is the natural width of the `<horizontal list>` or `<dimen>` given by the \ii to `to<dimen>` parameter or it is spread by the `<dimen>` given by the \ii spread `spread<dimen>` parameter. The height of the box is the maximum of heights of all elements in the `<horizontal list>`. The depth of the box is the maximum of depths of all such elements. These elements are set on the common baseline (exceptions can be given by {\noda\x`\lower` or \x`\raise`} commands).
* \i vbox `\vbox{<cmds>}` or `\vbox to<dimen>{<cmds>}` or `\vbox spread<dimen>{<cmds>}` creates a box. The material inside this box is a `<vertical list>` generated by `<cmds>` in vertical mode in a group. The height of the box is the natural height of the `<vertical list>` (possibly modified by values from \ii to `to` or \ii spread `spread` parameters) without the depth of the last element. The depth of the last element is set as the depth of the box. The width of the box is the maximum of widths of elements in the `<vertical list>`. All elements are placed at the common left margin of the box (exceptions can be given by {\noda\x`\moveleft` or \x`\moveright`} commands).
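The box constructors described above can be sketched as follows. The dimensions and the texts are arbitrary values chosen only for this illustration.
\begtt
\hbox to 5cm{left\hfil right} % the width is exactly 5cm, \hfil stretches
\hbox spread 1cm{a\hfil b}    % the width is the natural width plus 1cm
\vbox{\hbox{first line}\hbox{second line}} % two boxes, one above the other,
                                           % aligned at the left margin
\endtt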
* \i vtop `\vtop{<cmds>}` (with optional `to` or `spread` parameters) is the same as `\vbox`, but the baseline of the resulting box goes through the baseline of the first element in the `<vertical list>` (note that `\vbox` has its baseline equal to the baseline of the last element inside).
* \i vcenter `\vcenter{<cmds>}` (with optional `to` or `spread` parameters) is equal to `\vbox`, but its \ii math/axis {\em math axis}\fnote {The math axis is a horizontal line which goes through centers of + and $-$ symbols. Its distance from the baseline is declared in the math font metrics.} is exactly in the middle of the box. So its baseline is appropriately shifted. The `\vcenter` can be used only in math modes but given `<cmds>` are processed in vertical mode.
* \i lower `\lower<dimen><box>`, \i raise `\raise<dimen><box>` move the `<box>` down or up by the `<dimen>` in horizontal mode. \i moveleft `\moveleft<dimen><box>`, \i moveright `\moveright<dimen><box>` move the `<box>` left or right by the `<dimen>` in vertical mode.
* \i setbox `\setbox<box number>=<box>`. \TeX/ has a set of \ii box/register {\em box registers} addressed by `<box number>` and accessed via \i box `\box<box number>` or alternatives described below. The `\setbox` command saves the given `<box>` to the register addressed by `<box number>`. Macro programmers use only 0 to 9 \z`<box numbers>` directly. Other addresses to box registers should be allocated by the {\noda\i newbox `\newbox|<control sequence>;`} macro. The `|<control sequence>;` is equivalent to a `<box number>`, not to the box register itself. The `\setbox` command does an assignment, so the \x`\global` prefix is needed if you want to use the saved box outside the current group.
* \i box `\box<box number>` returns the box from the box register `<box number>`. Example: you can do `\setbox0=\hbox{abc}`. This `\hbox` isn't printed but saved to the register 0.
At a different place you use `\box0`, which prints `\hbox{abc}`, or you can do `\setbox0=\hbox{cde\box0}` which saves the `\hbox{cde\hbox{abc}}` to the register~0. * \i copy `\copy<box number>` returns the box from `<box number>` box register and keeps the same box in this box register. Note that the \i box `\box<box number>` returns the box and empties the register `<box number>` immediately. If you don't want to empty the register, use `\copy`. * \i wd `\wd<box number>`, `\ht<box number>`, `\dp<box number>`. You can measure or use the width, height and depth of a box saved in a register addressed by `<box number>`. Examples `\mydimen=\ht0`, `\hbox to\wd0{...}`. You can re-set the dimensions of a box saved in a register addressed by `<box number>`. For example \i setbox `\setbox0=\hbox{abc}` `\wd0=0pt` `\box0` gives the same result as \i hbox `\hbox to0pt{abc}` but without the warning about \ii overfull/box overfull `\hbox`. * \i unhbox `\unhbox<box number>`, \kern-.4pt\i unvbox `\unvbox<box number>`, \kern-.4pt\i unhcopy `\unhcopy<box number>`, \kern-.4pt\i unvcopy `\unvcopy<box number>` do the same work as `\box` or `\copy` but they don't return the whole box but only its contents, i.e.~the horizontal or vertical material. Example: try to do \i setbox `\setbox0=\hbox{abc}` and later `\setbox0=\hbox{cde\unhbox0}` saves the \i hbox `\hbox{cdeabc}` to the box register~0. The \x`\unhbox` and \x`\unhcopy` commands return the \x`\hbox` contents and \x`\unvbox`, \x`\unvcopy` commands return the \x`\vbox` contents. If incompatible contents are saved, then \TeX/ reports an error. You can test the type of saved contents by \x`\ifhbox` or \x`\ifvbox`. * \i vsplit `\vsplit<box number> to<dimen>` does a column break. The `<vertical material>` saved in the box `<box number>` is broken into a first part of `<dimen>` height and the rest remains in the box `<box number>`. The broken part is saved as a `\vbox` which is the result of this operation. 
For example, you can say `\newbox\column` `\setbox\column=\vbox{...}` and later `\setbox0=\vsplit\column to5cm`. The `\box0` is a `\vbox` containing the first 5cm of saved material. And the `\column` box includes the rest of the material. * \x`\lastbox` returns the last box in the current vertical or horizontal material and removes it. \enditems \noindent {\bf Commands for rules (lines in the typesetting output) and patterns} \begitems * \x`\hrule` creates a horizontal line in the current vertical list. If it is used in horizontal mode, it finishes the paragraph by {\noda\x`\par`} first. `\hrule /width<dimen>. /height<dimen>. /depth<dimen>.` creates (in general, with given parameters) a full rectangle (something like a box, but it isn't treated as the box) with given dimensions. Default values are: \"width"~=width of outer \x`\vbox`, \"height"~=0.4\,pt, \"depth"~=0\,pt. {\emergencystretch=2em\par} * \x`\vrule` creates a vertical line in the current horizontal list. If it is used in vertical mode, it opens the horizontal mode first. `\vrule /width<dimen>. /height<dimen>. /depth<dimen>.` creates (in general, with given parameters) a full rectangle with given dimensions. Default values are: \"width"~=0.4\,pt, \"height"~=height of outer `\hbox`, \"depth"~=depth of outer \x`\hbox`. {\emergencystretch=2em\par} The optional parameters of \x`\hrule` and \x`\vrule` can be specified in arbitrary order and they can be specified more than once. In such a case, the rule \"last wins" is applied. * \i leaders `\leaders<rule><glue>` creates a glue (maybe shrinkable or stretchable) filled by a full rectangle. The `<rule>` is \x`\vrule` or \x`\hrule` (maybe with its optional parameters). 
If the `<glue>` is specified by an {\noda\x`\hskip`} command (maybe with its optional parameters) or by its alternatives {\noda\x`\hss`, \x`\hfil`, \x`\hfill`}, then the resulting glue is horizontal (can be used only in horizontal mode) and its dimensions are: width derived from `<glue>`, height plus depth derived from `<rule>`. If the `<glue>` is specified by a {\noda\x`\vskip`} command (maybe with its optional parameters) or by its alternatives {\noda\x`\vss`, \x`\vfil`, \x`\vfill`}, then the resulting glue is vertical (can be used only in vertical mode) and its dimensions are: height derived from `<glue>`, width derived from `<rule>`, depth is zero. * \i leaders `\leaders<box><glue>` creates a vertical or horizontal glue filled by a pattern of repeated `<box>`. The positions of boxes are calculated from the boundaries of the outer box. It is used for the dots patterns in the table of contents. \i cleaders `\cleaders<box><glue>` does the same, but the pattern of boxes is centered in the space derived by the <glue>. Spaces between boxes are not~inserted. \i xleaders `\xleaders<box><glue>` does the same, but the spaces between boxes are inserted equally. \enditems \noindent {\bf More commands for creating something in typesetting output} \begitems * \x`\par` closes horizontal mode and finalizes a paragraph. In vertical mode, it does nothing. * \x`\indent`, \x`\noindent`. They leave vertical mode and open a paragraph with/without paragraph indentation. If horizontal mode is current then `\indent` inserts an empty box of `\parindent` width; `\noindent` does nothing. * \x`\hskip`, \x`\vskip`. They insert a horizontal/vertical glue. Documented in section~\ref[boxes]. * \x`\hfil`, \x`\hfill`, \x`\hss`, \x`\vfil`, \x`\vfill`, \x`\vss` are alternatives of \x`\hskip`, \x`\vskip`, see section~\ref[boxes]. * \x`\hfilneg`, \x`\vfilneg` are shortcuts for `\hskip 0pt plus-1fil` and `\vskip 0pt plus-1fil`. 
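The glue primitives above, together with `\leaders` from the previous group of commands, can be sketched as follows. The width 6cm and the texts are arbitrary values chosen only for this illustration.
\begtt
\hbox to 6cm{\hfil centered text\hfil}  % the text is centered in the box
\hbox to 6cm{\hfill text at the right}  % the text is pushed to the right
\hbox to 6cm{Chapter one \leaders\hbox to 1em{\hss.\hss}\hfil 7}
                                        % the gap is filled by dots
\endtt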
* \i kern `\kern<dimen>` puts unbreakable horizontal/vertical space depending on the current mode. * \i penalty `\penalty<number>` puts the penalty `<number>` on the current horizontal/vertical list. * \i char `\char<number>` prints the character with code \kern-2pt`<number>`\kern-2pt. The \"character itself" does the same. * \i accent `\accent<number><character>` places an accent with code `<number>` above the `<character>`. * \ii -space `\`{\tt\char9251} is the \ii control/space control space. In horizontal mode, it inserts the space glue (like normal space but without modification by the \x`\spacefactor`). In vertical mode, it opens horizontal mode and puts the space. Note that normal space does nothing in vertical mode. * \i discretionary `\discretionary{<pre break>}{<post break>}{<no break>}` works in horizontal mode. It prints `<no break>` in normal cases but if there is a line break then `<pre break>` is used before and `<post break>` after the breaking point. German Zucker/Zuk-ker (sugar) can be implemented by `Zu\discretionary{k-}{k}{ck}er`. * \ii -hyphen `\-` is equal to \i hyphenchar `\discretionary{\char\hyphenchar<font>}{}{}`. The `\hyphenchar<font>` is used as a hyphenation character. It is set to \x`\defaulthyphenchar` value when the font is loaded, but it can be changed. * \ii -italiccorr \code{\\/} does an \ii italic/correction italic correction. It puts a little space if the last character is slanted. * \x`\unpenalty`, \x`\unskip` removes the last penalty/last glue from the current horizontal/vertical list. * \i vadjust `\vadjust{<cmds>}`. This works in horizontal mode. The `<cmds>` must create a `<vertical list>` and `\vadjust` saves a pointer to this list into the current horizontal list. When `\par` creates lines of the paragraph and distributes them to a vertical list, each line with the pointer from `\vadjust` has the corresponding `<vertical list>` immediately appended after this line. * `\insert<number>{<cmds>}`. 
The `<cmds>` create a `<vertical list>` and `\insert` saves a pointer to such a `<vertical list>` into the current list. The output routine can work with such `<vertical list>`\kern-2pts. The footnotes or \ii floating~object {\em floating objects} (tables, figures) are implemented by the `\insert` primitive. * \i halign `\halign{<declaration>\cr`\z`<row 1>\cr<row 2>\cr...\cr<row n>\cr}` creates a table of boxes in vertical mode. The `<declaration>` declares one or more column patterns separated by `&`\c4. The rows use the same character to separate the items of the table in each row. The `\halign` works in two passes. First it saves all items to boxes and the second pass performs `\hbox to`~$\Blue w$ for each saved item, where $\Blue w$ is the maximum width of items in each actual column. Detailed documentation of `\halign` is out of scope of this manual. Only one example follows: the macro `\putabove` puts `#1` above `#2` centered. The width of the resulting box is equal to the maximum of widths of these two parameters. The `<declaration>` `\hfil##\hfil` means that the items will be centered:\nl `\def\putabove#1#2{\vbox{\halign{\hfil##\hfil\cr#1\cr#2\cr}}}`. * \x`\valign` does the same as `\halign` but rows $\leftrightarrow$ columns. It is not commonly used. * \x`\cr`, \x`\crcr`, \x`\span`, \x`\omit`, \i noalign `\noalign{<cmds>}` are primitives used by `\halign` and `\valign`. \enditems \noindent {\bf Commands for register calculations} \begitems * \i advance `\advance<register>/by.<value>` does (formally) `<register>=<register>+<value>`. The `<register>` is counter type or dimen type. The `<value>` is `<number>` or `<dimen>` (depending on the type of `<register>`). * \i multiply `\multiply<register>/by.<number>` does `<register>=<register>*<number>`. * \i divide `\divide<register>/by.<number>` does `<register>=<register>`\code{/}`<number>`. If the `<register>` is number type then the result is truncated. 
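A minimal sketch of register calculations follows. It assumes the allocation macros `\newcount` and `\newdimen` from plain \TeX/; the register names `\mycount` and `\mydimen` are hypothetical.
\begtt
\newcount\mycount  \newdimen\mydimen
\mycount=7   \advance\mycount by 3    % \mycount is 10
\multiply\mycount by 4                % \mycount is 40
\divide\mycount by 6                  % \mycount is 6, the result is truncated
\mydimen=10pt \advance\mydimen by -2.5pt  % \mydimen is 7.5pt
\divide\mydimen by 2                  % \mydimen is 3.75pt
\endtt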
* See \*`\numexpr` and \*`\dimexpr`, expandable primitives documented in sections~\ref[reg] and~\ref[expand].
* \*\*\i pdfuniformdeviate `\pdfuniformdeviate<number>` expands to a random number uniformly distributed in the range 0 (inclusive) to `<number>` (exclusive). Normal distribution between $-65536$ and $65536$ can be reached by \*\*\i pdfnormaldeviate `\pdfnormaldeviate`. The generator is initialized by the time of the compilation, or you can use \*\*\i pdfsetrandomseed `\pdfsetrandomseed<number>` to do fixed initialization; `<number>` is an integer less than 1,000,999,999. Lua\TeX/ supports the same primitives but without the `\pdf` prefix.
\enditems

\noindent {\bf Internal codes}
\begitems
* \i catcode `\catcode<number>` is the category code of the character with `<number>` code. Used by the tokenizer.
* \i lccode `\lccode<number>` is the lowercase alternative to the `\char<number>`. If it is zero then a lowercase alternative doesn't exist (for example for punctuation). Used by the `\lowercase` primitive and when breaking points are calculated from hyphenation patterns.
* \i uccode `\uccode<number>` is the uppercase alternative to the `\char<number>`. If it is zero, then the uppercase alternative doesn't exist. Used by the `\uppercase` primitive.
* \i lowercase \i uppercase `\lowercase/<expandafters>.`\z`{|<text>};`, \ `\uppercase/<expandafters>.`\z`{|<text>};` transform \z`|<text>;` to lowercase/uppercase using the current `\lccode` or `\uccode` values. Returns transformed \z`|<text>;` where catcodes of tokens and tokens of type `<control sequence>` are unchanged.
* \i sfcode `\sfcode<number>` is the spacefactor code of the `\char<number>`. The `\spacefactor` register keeps (roughly speaking) the `\sfcode` of the last printed character. The glue between words is modified (roughly speaking) by this `\spacefactor`. The value 1000 means factor 1:1 (no modification is done).
It is used for enlarging spaces after periods and other punctuation in English texts.\fnote{ This feature is not compliant with other typographical traditions, so the `\frenchspacing` macro which sets all `\sfcodes` to 1000 is used very often.} {\emergencystretch=2em\par} \enditems \noindent {\bf Commands for reading or writing text files} \begitems * Note that the main input stream is controlled by `\input` and `\endinput` expandable primitive commands documented in section~\ref[expand]. * \i openin `\openin<file number>`\,`=`\,`<file name><space>` \ (or `\openin`\,`<file number>`\,`=`\,`{<file name>}`) \ opens the file `<file name>` for reading and creates a file descriptor connected to the `<file number>`.\fnote {\noda Note that `<file number>` is an address to the file descriptor. Macro programmers don't use these addresses directly but by the \i newread `\newread|<control sequence>;` and \i newwrite `\newwrite|<control sequence>;` allocation macros.} If the file doesn't exist nothing happens but a macro programmer can test this case by `\ifeof<file number>`. * \i read `\read<file number>to|<control sequence>;` does `\def|<control sequence>{<replacement text>};`\nl where the `<replacement text>` is the tokenized next line from the file declared by `\openin` as `<file number>`. * \i openout `\openout<file number>=<file name><space>` (or `\openout<file number>="<file name>"`) \ opens the `<file name>` for writing and creates a file descriptor connected to `<file number>`. If the file already exists, then its contents are removed. {\emergencystretch=2em\par} * \i write `\write<file number>`\z`|{<text>};` writes a line of \z`<text>` to the file declared by `\openout` as `<file number>`. But this isn't done immediately. \TeX/ does not know the value of the current page when the `\write` command is processed because the paragraph building and page building algorithms are processed asynchronously. 
But a macro programmer typically needs to save the current page number to the file in order to read it again and to create a Table of contents or an Index. `\write<file number>`\z`|{<text>};` saves \z`|<text>;` into memory and puts a pointer to this memory into the typesetting output. When the page is shipped out (by the output routine), then all such pointers from this page are processed: the \z`<text>` is expanded at this time and its expansion is saved to the file. If (for example) the \z`<text>` includes `\the\pageno` then it is expanded to the correct page number of this page.
* \i closein `\closein<file number>`, \i closeout `\closeout<file number>` close the open file. It is done automatically when \TeX/ terminates its job.
* \x`\immediate` is a prefix. It can be used before `\openout`, `\write` and `\closeout` in order to do the desired action immediately (without waiting for the output routine).
\enditems

\noindent {\bf Other primitive commands}
\par\nobreak\medskip\nobreak
\begitems
* \x`\relax` does nothing. Used for terminating incomplete optional parameters, for example.
* \x`\begingroup` opens a group, \x`\endgroup` closes a group. The `{`\c1 and `}`\c2 do the same but moreover, they are syntactic constructors for primitive commands and math lists (in math mode). These two types of groups (declared by the mentioned commands or by the mentioned characters) cannot be mixed, i.e.\ `\begingroup...}` gives an error. Plain \TeX/ declares {\noda\x`\bgroup` and \x`\egroup`} control sequences as equivalents to `{`\c1 and `}`\c2. They can be used instead of `{`\c1 and~`}`\c2 when we need to open/close a group, to create a math list, or when a box is constructed. For example, \z`\hbox\bgroup<text>\egroup` is syntactically correct.
* \i aftergroup `\aftergroup|<token>;` saves the `|<token>;` and puts it back in the input queue immediately after the current group is closed. Then the expand processor expands it (if it is expandable).
More `\aftergroup`s in one group create a queue of `|<token>;`s used after the group is closed.
* \i afterassignment `\afterassignment|<token>;` saves the `|<token>;` and puts it back immediately after a following assignment (`<register>=<value>`, `\def`, etc.)\ is done.
* \x`\lastskip`, \x`\lastpenalty` return the value of the last element in the current horizontal or vertical list if it is a glue/penalty. They return zero if the last element is not a glue/penalty.
* \x`\ignorespaces` ignores spaces in horizontal mode until the next primitive command occurs.
* \i mark \z`\mark{<text>}` saves \z`<text>` to memory and puts a pointer to it in the typesetting output. The \z`<text>` is used as expansion output of \x`\firstmark`, \x`\topmark` and \x`\botmark` expansion primitives in the output routine.
* \i parshape `\parshape<number>`%
{\def<#1>{\,$\langle\it#1\rangle$\,}%
`<I1><W1><I2><W2>...<In><Wn>` enables setting an arbitrary shape of the paragraph. The `<number>` declares the amount of data: the `<number>` pairs of `<dimen>`s follow. The $i$-th line of the paragraph is shifted by `<Ii>` to the right and its width is `<Wi>`.} The `\parshape` data are re-set after each paragraph to zero values (normal paragraph).
* \i special \z`\special{<text>}` puts the message \z`<text>` into the typesetting output. It behaves as a zero-dimension pointer to \z`<text>` and it can be read by printer drivers. It is recommended not to use this old technology when PDF output is created directly.
* \i shipout `\shipout<box>` outputs the `<box>` as one page. Used in the \ii output/routine output routine.
* \x`\end` completes the last page and terminates the job.
* \x`\dump` dumps the memory image to a file named `\jobname.fmt` and terminates the job.
* \i patterns `\patterns{<data>}` reads hyphenation patterns for the current \x`\language`.
* \i hyphenation `\hyphenation{<data>}` reads hyphenation exceptions for the current \x`\language`.
* \i message \z`\message{<text>}` prints \z`<text>` on the terminal and to the log file. * \i errmessage \z`\errmessage{<text>}` behaves like \z`\message{<text>}` but \TeX/ treats it as an error. * Job processing modes can be set by \x`\scrollmode` (don't pause at errors), \x`\nonstopmode` (don't pause at errors or missing files), \x`\batchmode` (\x`\nonstopmode` plus no output to the terminal). Default is \x`\errorstopmode` (stop at errors). * \x`\inputlineno` contains the number of the current line of the file being input. * \i show `\show|<control sequence>;`, \ \i showbox `\showbox<box number>`, \ \x`\showlists`, \ and \ \i showthe `\showthe|<register>;` \ are tracing commands. \TeX/ prints the desired result on the terminal and to the log file and pauses. \enditems \noindent {\bf Commands specific for PDF output} (available in pdf\TeX, \XeTeX/ and Lua\TeX) \begitems * \i pdfoutput `\pdfoutput` is a numeric register. If its value is 1 then PDF format is generated. * \i pdfliteral \z`\pdfliteral{<text>}` puts the \z`<text>` interpreted in a low level PDF language to the typesetting output. All PDF constructs defined in the PDF specification are allowed. The dimensions of the `\pdfliteral` object in the output are considered zero. So, if \z`<text>` moves the current typesetting point then the notion about its position from the \TeX/ point of view differs from the real position. A good practice is to enclose \z`<text>` in `q...Q` PDF commands. The command `\pdfliteral` is typically used for generating graphics and for linear transformations. * \i pdfsave \i pdfrestore `\pdfsave`, `\pdfrestore` save and restore the PDF graphics state (like the `q`, `Q` PDF commands). * \i pdfcolorstack `\pdfcolorstack<number><op>`\z`{<text>}` (where `<op>` is `push` or `pop` or `set`) behaves like \z`\pdfliteral{<text>}` and it is used for color switchers. For example when \z`<text>` is `1 0 0 rg` then the red color is selected.
\TeX/ sets the color stack at the top of each page to the color stack opened at the bottom of the previous page. * \i pdfximage `\pdfximage` `/height<dimen>.` `/depth<dimen>.` `/width<dimen>.` `/page<number>.{<file name>}` loads the image from `<file name>` to the PDF output and returns the number of such a data object in the \x`\pdflastximage` register. Allowed formats are PDF, JPG, PNG. The image is not drawn at this moment. A macro programmer can save `\mypic=\pdflastximage` and draw the image by \i pdfrefximage `\pdfrefximage\mypic` (maybe repeatedly). Data of the image are loaded to the PDF output only once. The `\pdfximage` allows more parameters; see pdf\TeX/ documentation. * \i pdfsetmatrix {\def<#1>{$\,\langle{\it#1}\rangle\,$}`\pdfsetmatrix {<a><b><c><d>}` multiplies the current transformation matrix (used for linear transformations) by `\matrix{<a>&<c>\cr <b>&<d>}`.} * \i pdfdest `\pdfdest name{<label>}<type>\relax` declares a destination of a hyperlink. The `<label>` must match the `<label>` used in `\pdfoutline` or `\pdfstartlink`. The `<type>` declares the behavior of the PDF viewer when the hyperlink is used. For example, `xyz` means without changes of the current zoom (if not specified). Other types can be `fit`, `fith`, `fitv`, `fitb`. * \i pdfstartlink `\pdfstartlink` `/height<dimen>.` `/depth<dimen>.` `/<attributes>.` `/goto name{<label>}.` declares the beginning of a hyperlink. The text (which will be sensitive to mouse clicks) immediately follows and it is terminated by \x`\pdfendlink`. The height and depth of the sensitive area and the `<label>` used in `\pdfdest` are declared here. More parameters are allowed; see the pdf\TeX/ documentation. * \i pdfoutline \z`\pdfoutline /goto name{<label>}. /count<number>. {<text>}` creates one item with \z`<text>` in PDF outlines. `<label>` must be used somewhere by `\pdfdest name{<label>}`. The `<number>` is the number of direct descendants in the outlines tree.
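
The cooperation of `\pdfdest`, `\pdfstartlink` and `\pdfoutline` can be illustrated by the following minimal sketch. It assumes the pdf\TeX/ syntax (or a format which provides these names); the label `intro`, the dimensions and the texts are hypothetical, chosen only for this illustration. Note that `\pdfstartlink` must be used in horizontal mode.

\begtt
% a destination labeled intro; xyz keeps the current zoom
\pdfdest name{intro} xyz\relax
Introduction. ...

% a clickable text (without a visible border) which jumps to intro
See \pdfstartlink height 9pt depth 3pt attr{/Border[0 0 0]}
         goto name{intro}\relax the introduction\pdfendlink.

% one item in PDF outlines, it has no descendants
\pdfoutline goto name{intro} count 0 {Introduction}
\endtt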
* \i pdfinfo `\pdfinfo {<key>`\z`(<text>)}` saves to PDF the information which can be listed by the command `pdfinfo <file>.pdf` on the command line for example. More `<key>`\z`(<text>)` pairs can be here. The `<key>` can be \code{/Author}, \code{/Title}, \code{/Subject}, \code{/Keywords}, \code{/Creator}, \code{/Producer}, \code{/CreationDate}, \code{/ModDate}. The last two keywords need a special format of the \z`<text>` value. All \z`<text>` values (including \z`<text>` used in the `\pdfoutline`) must be ASCII encoded or they can use a very special PDFunicode encoding. * \x`\pdfcatalog` enables us to set the default behavior of the PDF viewer when it starts. * \x`\pdfsavepos` saves an internal invisible point to the typesetting output. These points are processed when the page is shipped out: the numeric registers \x`\pdflastxpos` and \x`\pdflastypos` get values for the absolute position of this invisible point (measured from the left lower corner of the page in `sp` units). The macro programmer can follow `\pdfsavepos` by the `\write` command and save these absolute positions to a text file which can be read in the next run of \TeX/ in order to get these absolute positions by macros. \enditems \noindent {\bf Selected \LuaTeX/ primitives} \begitems * \x`\pdffeedback`, \x`\pdfextension`, \x`\pdfvariable` declare pdf\TeX/ \"primitives" in \LuaTeX.
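
A few such declarations may look as follows. This is only a sketch of the pattern documented in the \LuaTeX/ manual; a format intended for \LuaTeX/ typically provides a complete set of such definitions itself, so you rarely need to write them.

\begtt
% commands of the PDF backend are declared by \pdfextension
\protected\def\pdfliteral   {\pdfextension literal}
\protected\def\pdfcatalog   {\pdfextension catalog }
% read-only values are obtained by \pdffeedback
\def\pdflastlink            {\numexpr\pdffeedback lastlink\relax}
% backend parameters are accessed by \pdfvariable
\edef\pdfcompresslevel      {\pdfvariable compresslevel}
\endtt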
* Moreover, \LuaTeX/ uses different names for several primitives: {\spaceskip=.3em plus.7em minus.1em \x`\pagewidth` is \q`\pdfpagewidth`, \x`\pageheight` is \q`\pdfpageheight`, \x`\outputmode` is \q`\pdfoutput`, \x`\savepos` is \q`\pdfsavepos`, \x`\lastxpos` is \q`\pdflastxpos`, \x`\lastypos` is \q`\pdflastypos`, \x`\saveimageresource` is \q`\pdfximage`, \x`\lastsavedimageresourceindex` is \q`\pdflastximage`, \x`\useimageresource` is \q`\pdfrefximage`, \x`\protrudechars` is \q`\pdfprotrudechars`, \x`\normaldeviate` is \q`\pdfnormaldeviate`, \x`\uniformdeviate` is \q`\pdfuniformdeviate`, \x`\setrandomseed` is \q`\pdfsetrandomseed`. } * \x`\catcodetable`, \x`\initcatcodetable`, \x`\savecatcodetable` make copies of whole catcode tables and enable switching between them. * \x`\attributedef` declares a sequence as an attribute number. Similar to \q`\countdef`. * \x`\suppressfontnotfounderror` and others switch on/off error reporting in specified cases. * \x`\matheqdirmode`, \x`\breakafterdirmode` are two of many parameters for right-to-left typesetting. * \x`\crampeddisplaystyle`, \x`\crampedtextstyle`, \x`\crampedscriptstyle`,\nl \x`\crampedscriptscriptstyle` are reduced math styles (below a fraction line, for example). \enditems There are many additional primitives in \LuaTeX/, see its \ulink[https://www.pragma-ade.com/general/manuals/luatex.pdf]{documentation}. Only a few of them are mentioned here. \medskip \noindent {\bf Microtypographical extensions} (available in pdf\TeX/, Lua\TeX/ and not all of them in \XeTeX) \begitems * \i pdffontexpand `\pdffontexpand <font selector> <stretching> <shrinking> <step>` declares a possibility to deform the characters from the font given by `<font selector>`. This deformation is used when stretching or shrinking paragraph lines or doing `\hbox to{...}` in general. I.e.\ not only glues are stretchable and shrinkable. The numeric parameters are given in 1/1000 of the font size.
`<stretching>` and `<shrinking>` are the maximum allowed values. The stretching or shrinking are not applied continuously but by the given `<step>`. To activate this feature you must set the \x`\pdfadjustspacing` numeric register to a positive value. * \i efcode `\efcode <font selector><char. code>=<number>` sets the degree of willingness of the given character to be deformed when `\pdffontexpand` is used. The default value for all characters is 1000 and `<number>`/1000 gives the proportion coefficient for stretching or shrinking of the character with respect to the \"normal" deformation of characters with default value 1000. * \i rpcode \i lpcode `\rpcode <font selector><char. code>=<number>`, `\lpcode <font selector><char. code>=<number>` allows the declaration of hanging punctuation. Such punctuation is slightly moved to the right margin (if `\rpcode` is declared and the character is at the right margin) or to the left margin (for `\lpcode` by analogy). The `<number>` gives the amount of such movement in 1/1000 of the font size. To activate this feature you must set \x`\pdfprotrudechars` to a positive value (2 or more means a better algorithm). * \i letterspacefont `\letterspacefont |<control sequence>; <font selector> <number>` declares a new font selector `|<control sequence>;` as a font given by the `<font selector>`. Additional space declared by `<number>` is added between each two characters when the font is used. The `<number>` is 1/1000 of the font size. Unicode fonts support an analogous `letterspace=<number>` font feature. * The following commands have the same syntax as `\rpcode`: \x`\knbscode` (added space after the character), \x`\stbscode` (added stretchability of the glue after the character), \x`\shbscode` (added shrinkability after the character), \x`\knbccode` (added kern before the character), \x`\knaccode` (added kern after the character). To activate this feature you must set \x`\pdfadjustinterwordglue` to a positive value.
This feature is supported by pdf\TeX/ only. \enditems \goodbreak\noindent {\bf Commands used in math mode} \begitems * \x`\displaystyle`, \x`\textstyle`, \x`\scriptstyle`, \x`\scriptscriptstyle` are `<style primitive>`\kern-2pts. They switch to the specified style. \i mathchoice `\mathchoice{<D>}{<T>}{<S>}{<SS>}` prints only one of its arguments depending on the current math style. * \x`\mathord`, \x`\mathop`, \x`\mathbin`, \x`\mathrel`, \x`\mathopen`, \x`\mathclose`, \x`\mathpunct` followed by `{<math list>}` create a math object of the given class. * \i over `{<numerator>\over<denominator>}` creates a fraction. The primitive commands \x`\atop` (without fraction rule), \i above `\above<dimen>` (fraction rule with given thickness) should be used in the same manner. The commands \x`\atopwithdelims`, \x`\overwithdelims`, \x`\abovewithdelims` allow us to specify brackets around the generalized fraction. * \i left \i right `\left<delimiter><formula>\right<delimiter>` creates a math `<formula>` and gives `<delimiter>`s around it with an appropriate size (compatible with the size of the formula). The `<delimiter>`s are typically brackets. * \*\i middle `\middle<delimiter>` can be used inside the <formula> surrounded by `\left`, `\right`. The given <delimiter> gets the same size as delimiters declared by appropriate `\left`, `\right`. * Exponents and scripts are typically at the right side of the preceding math object. But if this object is a \"big operator" (summation, integral) then exponents and scripts are printed above and below this operator. The commands \x`\limits`, \x`\nolimits`, \x`\displaylimits` used before exponents and scripts constructors (`^`\c7 and `_`\c8) declare an exception from this rule. * \i eqno `$$<formula>\eqno<mark>$$` puts the `<mark>` to the right margin as `\llap{$<mark>$}`. Analogously, \i leqno `$$<formula>\leqno<mark>$$` puts it to the left margin.
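
Several commands from this list are combined in the following short sketch. The formula itself and the macro name `\stylename` are hypothetical, they serve only as an illustration.

\begtt
% fraction by \over, delimiters sized by \left...\right, limits of the
% sum at its right side due to \nolimits, mark (1) at the right margin
$$ \left( {a+b \over c} \right)^2 \le \sum\nolimits_{i=1}^n x_i^2 \eqno (1) $$

% \mathchoice selects one of its arguments by the current math style
\def\stylename{\mathchoice{D}{T}{S}{SS}}
$\stylename^{\stylename}$   % typesets T with the superscript S
\endtt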
* \x`\mkern`, \x`\mskip` work like `\kern`, `\hskip`, but dimensions are set in `mu`=1/18`em` units. * \x`\nonscript` ignores the following skip command if it is used in <S> or <SS> style. \enditems \noindent {\bf Commands for setting math codes and math-family fonts}\par\nobreak\medskip Each character used in math mode must have its {\em math-code}. It includes the {\em class} of the character and how the glyph of the character should be printed. The class is one of these: 0=Ord, 1=Op, 2=Bin, 3=Rel, 4=Open, 5=Close, 6=Punct, and it affects spacing between objects, superscript/subscript behavior, etc. The glyph for printing the character is saved in a {\em math-family font} at its {\em slot}. Each math-family font has an assigned number using `\textfont`, `\scriptfont` and `\scriptscriptfont` primitives. When old 7bit \TeX/ fonts are used, then the whole set of math characters is divided into several math-family fonts, each of which has its own number. When Unicode math is used then all math characters are stored in a single font and we (almost) never need to use more than a single math-family font with a single number. The format must specify the math-code (i.e.\ class, math-family font number and slot) for all characters used in math mode by the following primitives. The `<math-code>` mentioned below is a single 15bit number mostly used in hexadecimal form with four digits: `"<d1><d2><d3><d4>`, where `<d1>` is the class, `<d2>` is the math-family font number and `<d3><d4>` is the slot. \begitems * \i mathcode `\mathcode <num>=<math-code>` sets the math-code for the character given by its <num> ASCII code. The `<num>` is an 8bit number. * \i mathchardef `\mathchardef <sequence>=<math-code>` declares math-code for the given `<sequence>`. When the `<sequence>` is used in math mode then it behaves as a single object equal to a real single character with its `<math-code>`. * \i textfont `\textfont<num>=<font>` declares math-family font `<num>` as `<font>` for normal size characters.
The `<font>` is a font selector given previously by the `\font` primitive. * \i scriptfont `\scriptfont<num>=<font>` declares math-family font `<num>` as `<font>` for script size. * \i scriptscriptfont `\scriptscriptfont<num>=<font>` declares math-family font `<num>` as `<font>` for script-in-script size. \enditems Unicode values can be set in \XeTeX/ and \LuaTeX/: \begitems * \*\*\*\i Umathcode `\Umathcode<num>=<class><math-family><slot>` sets the math-code for a character given by its Unicode <num>. The math-code is presented by three independent numbers. * \*\*\*\i Umathchardef `\Umathchardef<sequence>=<class><math-family><slot>` declares `<sequence>` as a math object with the given math-code. \enditems The scalable parentheses used after `\left`, `\right`, `\middle` must have their delimiter-code `<del-code>`. It is a 24bit number. When the hexadecimal form `"<d1><d2><d3><d4><d5><d6>` of this number is used then it gives math-family font number `<d1>` and slot `<d2><d3>` for basic size (typically a normal text font) and math-family font number `<d4>` and slot `<d5><d6>` for the first successor of \"parentheses chain" implemented in the font (typically a special font). \begitems * \i delcode `\delcode<num>=<del-code>` sets the delimiter-code for the ASCII character `<num>`. * \*\*\*\i Udelcode `\Udelcode<num>=<math-family><slot>` sets the delimiter-code for the Unicode character `<num>` when a Unicode math font is loaded. The font must implement the \"parentheses chain" at the `<slot>` directly; we needn't distinguish the basic size and the first successor. \enditems \noindent{\bf Commands for using math-codes directly in math mode} \begitems * \i mathchar `\mathchar<math-code>` prints a math object given by math-code. * \*\*\*\i Umathchar `\Umathchar<class><math-family><slot>` prints a math object given by math-code. * \i mathaccent `\mathaccent<math-code><object>` prints an accent above `<object>` given by its math-code.
The `<object>` can be single math object or `{<math formula>}`. * \*\*\*\i Umathaccent `\Umathaccent<keyword><class><math-family><slot><object>` creates an accent over `<object>` given by its math-code. The accent is stretchable (relative to the width of the `<object>`) by default and if the font implements the \"accents chain" at the `<slot>`. The optional `<keyword>` is `fixed` (do not stretch the accent) or `bottom` (place the accent to the bottom of the `<object>`). * \i delimiter `\delimiter<del-code>` prints a given delimiter, can be used after `\left`, `\right`, `\middle`. The `<del-code>` can have seven hexadecimal digits, first of them is class, others give normal `<del-code>`. The class is used if the `\delimiter` doesn't follow `\left`, `\right`, `\middle`. * \*\*\*\i Udelimiter `\Udelimiter<class><math-family><slot>` behaves as a character with given delimiter-code (after `\left`, `\right`) or as a normal math character with its `<class>` (in other cases). * \i radical `\radical<radical-code><object>` creates radical symbol over `<object>`. The `<radical-code>` is interpreted as `<del-code>`, i.e.\ the first font must include the basic size and the second font must implement the \"radicals chain". * \*\*\*\i Uradical `\Uradical<math-family><slot><object>` creates radical symbol over `<object>`. The Unicode math font must implement the \"radicals chain" at the `<slot>`. \enditems \sec[plain] Summary of plain \TeX/ macros \noindent{\bf Allocators} \begitems * \x`\newcount`, \x`\newdimen`, \x`\newskip`, \x`\newmuskip`, \x`\newtoks` followed by a `|<control sequence>;` allocate a new register of the given type and set it as the `|<control sequence>;`. \x`\newbox`, \x`\newread`, \x`\newwrite` followed by a `|<control sequence>;` allocate a new address to given data (to a box register or to a file descriptor) and set it as the `|<control sequence>;`. All these allocation macros are declared as `\outer` in plain \TeX/, unfortunately. 
This brings problems when you need to use them in skipped text or in macros (in `<replacement text>` for example). Use `\csname newdimen\endcsname \yoursequence` in such cases. * \i newif `\newif|<control sequence>;` sets the `|<control sequence>;` as a boolean variable. It must begin with `if`; for example `\newif\ifsomething`. Then you can set values by `\somethingtrue` or `\somethingfalse` and you can use this variable by `\ifsomething` which behaves like other `\if`\code{*} primitive commands. \enditems \goodbreak\noindent{\bf Vertical skips} \begitems * \x`\bigskip` does \x`\vskip` by one line, \x`\medskip` does `\vskip` by one half of a line and \x`\smallskip` does the vertical skip by one quarter of a line. The registers \x`\bigskipamount`, \x`\medskipamount` and \x`\smallskipamount` are allocated for this purpose. * \x`\nointerlineskip` ignores the \x`\baselineskip` rule
%(see section~\ref[reg])
for the following box in the current vertical list. This box is appended immediately after the previous box. {\noda\x`\offinterlineskip`} ignores the \x`\baselineskip` rule for all following boxes until the current group is closed. * All vertical glues at the top of the page inserted by \x`\vskip` are ignored. Macro \x`\vglue` behaves like the `\vskip` primitive command but its glue is not ignored at the top of the page. * Sometimes we must switch off the \x`\baselineskip` rule (by the \x`\offinterlineskip` macro for example). This is common in tables. But we need to keep the baseline distances equal. Then the \x`\strut` can be inserted on each line. It is an invisible box with zero width and with height+depth=`\baselineskip`. * \x`\normalbaselines` sets the registers for vertical placement \x`\baselineskip`, \x`\lineskip` and \x`\lineskiplimit` to default values given by the format. The user can set other values for a while and then he/she can restore `\normalbaselines`.
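
The usage of `\strut` together with `\offinterlineskip` mentioned in this list can be sketched as follows. The boxes are stacked without any interline glue, but their baselines keep the distance `\baselineskip` because each box gets its height and depth from the strut. This holds only if no part of the text is taller or deeper than the strut; the texts are hypothetical.

\begtt
\vbox{\offinterlineskip   % no interline glue inside this group
   \hbox{\strut first line}
   \hbox{\strut second line}
   \hbox{\strut third line}}
\endtt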
\enditems \noindent{\bf Penalties} \par\nobreak\medskip\nobreak \begitems * \x`\break` puts penalty $-10000$, so a line/page break is forced here. \x`\nobreak` puts penalty 10000, so a line/page break is disabled here. It should be specified before a glue, which is \"protected" by this penalty. \x`\allowbreak` puts penalty 0; it allows breaking similar to a normal space. * \x`\goodbreak` puts penalty $-500$ in vertical mode, this is a \"recommended" point for a page break. * \x`\filbreak` breaks the page only if it is \"almost full" or if a big object (that doesn't fit the current page) follows. The bottom of such a page is filled by a vertical glue, i.e.\ the default typographical rule about equal positions of all bottoms of common pages is broken here. * \x`\eject` puts penalty $-10000$ in the vertical list, i.e.\ it breaks the page. \enditems \noindent{\bf Miscellaneous macros}\par\nobreak\medskip\nobreak \begitems * \i magstep `\magstep<number>` expands to a magnification factor $1.2^x$ where $x$ is the given `<number>`. This follows old typographical traditions that all sizes (of fonts) are distinguished by factors 1, 1.2, 1.44, etc. For example, `\magstep2` expands to 1440, because $1.2^2=1.44$ and 1000 is factor 1:1 in \TeX/. The \x`\magstephalf` macro expands to 1095 which corresponds to $1.2^{(1/2)}$. * \x`\nonfrenchspacing` sets special space factor codes (bigger spaces after periods, commas, semicolons, etc.). This follows English typographical traditions. \x`\frenchspacing` sets all space factors as 1:1 (usable for non English texts). * \x`\endgraf` is equivalent to \x`\par`; \x`\bgroup` and \x`\egroup` are equivalents to `{`\c1 and `}`\c2. * \x`\space` expands to space, \x`\empty` is an empty macro and \x`\null` is an empty \i hbox `\hbox{}`. * \x`\quad` is horizontal space 1\,em (size of the font), \x`\qquad` is double `\quad`, \x`\enspace` is kern 0.5\,em, \x`\thinspace` is kern 1/6\,em, and \x`\negthinspace` makes kern $-$1/6\,em. 
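
The typical usage of `\magstep` and `\magstephalf` is the `scaled` clause of the `\font` primitive. The following sketch assumes that the font `cmr10` is available; the font selector names `\bigrm` and `\medrm` are hypothetical.

\begtt
\font\bigrm=cmr10 scaled \magstep2     % cmr10 at 14.4pt (factor 1440)
\font\medrm=cmr10 scaled \magstephalf  % cmr10 at 10.95pt (factor 1095)
{\bigrm Large text}\quad {\medrm slightly enlarged text}
\endtt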
* \i loop \i repeat `\loop` \z`<body 1><if condition><body 2>\repeat` repeats \z`<body 1>` and \z`<body 2>` in a loop until the `<if condition>` returns false. Then \z`<body 2>` is not processed and the loop is finished. * \x`\leavevmode` opens a paragraph like `\indent` but it does nothing if the horizontal mode is already in effect. * \i line \z`\line{<text>}` creates a box of line width (which is \x`\hsize`). \x`\leftline`, \x`\rightline`, \x`\centerline` do the same as `\line` but \z`<text>` is shifted left / right / is centered. * \i rlap \z`\rlap{<text>}` makes a box of zero size, the \z`<text>` is stuck out to the right. \i llap \z`\llap{<text>}` does the same and the \z`<text>` is pushed left. * \x`\ialign` is equal to `\halign` but the values of the registers used by `\halign` are set to default. * \x`\hang` starts the paragraph where all lines (except for the first) are indented by `\parindent`. * \i textindent `\textindent{<mark>}` starts a paragraph with `\llap{<mark>}`. * \i item `\item{<mark>}` starts the paragraph with `\hang` and with `\llap{<mark>}`. Usable for item lists. \i itemitem `\itemitem{<mark>}` can be used for the second level of items. * \x`\narrower` sets wider margins for paragraphs (`\parindent` is appended to both sides); i.e.\ the paragraphs are narrower. * \x`\raggedright` sets the paragraph shape with the ragged right margin. \x`\raggedbottom` sets the page-setting shape with the ragged bottoms. * \i phantom `\phantom{<text>}` prints an empty box with dimensions like `\hbox{<text>}`. \i vphantom `\vphantom{<text>}`, \i hphantom `\hphantom{<text>}` do the same but the result of `\vphantom` sets its width to zero, the result of `\hphantom` sets its height plus depth to zero. \i smash `\smash{<text>}` prints `\hbox{<text>}` but height plus depth is set to zero. In math mode, these commands keep the current math style.
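
The `\loop` macro described in this list can be tried with the following minimal sketch. It assumes plain \TeX/; the register name `\mycount` is hypothetical. The separator is a part of the second body, so it is not printed after the last number.

\begtt
\newcount\mycount  \mycount=1
\loop
   \the\mycount               % body 1: print the counter
   \ifnum\mycount<5           % the if condition
   , \advance\mycount by1     % body 2: separator and increment
\repeat                       % typesets: 1, 2, 3, 4, 5
\endtt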
\enditems \noindent{\bf Floating objects} \par\nobreak\medskip\nobreak \begitems * \i footnote `\footnote{<mark>}`\z`{<text>}` creates a footnote with given `<mark>` and \z`<text>`. * \i midinsert \i topinsert \i endinsert `\topinsert<object>\endinsert` creates the `<object>` as a \ii floating~object {\em floating object}. It is printed at the top of the current page or on the next page. `\midinsert<object>\endinsert` does the same as `\topinsert` but it first tests whether the `<object>` fits on the current page. If so, it is printed at its current position; no floating object is created. \enditems \noindent{\bf Controlling of input, output} \begitems * \x`\obeyspaces` sets the space as normal, i.e.\ it deactivates special treatment of spaces by the tokenizer: more spaces will be more spaces and spaces at the beginning of the line are not ignored. * \x`\obeylines` sets the end of each line as `\par`. Each line in the input is one paragraph in the output. * \x`\bye` finalizes the last page (or last pages if more floating objects must be printed) and terminates the \TeX/ job. The \x`\end` primitive command does the same but without worrying about floating objects. \enditems \filbreak \noindent{\bf Macros used in math modes} \begitems * Spaces in math mode are \ii -comma `\,` (thin space), \ii -greater `\>` (medium space), \ii -column `\;` (thick space, but still small), \ii -exclam `\!` (negative thin space). * \i choose `{<above>\choose<below>}` creates a combination number with brackets around it. * \i sqrt `\sqrt{<math list>}` creates the square root symbol with the `<math list>` under it. * \i root \z`\root<n>\of{<math list>}` creates a general root symbol with the order of the root `<n>`. * \i cases {\def<#1>{\,$\langle#1\rangle$\,}%
\z`\cases{<case 1>&<condition 1>\cr...\cr<case n>&<condition n>}` creates a list of variants (preceded by a brace $\{$) in math mode.
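
For example, the absolute value can be defined by `\cases` as shown in the following sketch. It assumes the plain \TeX/ definition of `\cases` where the first column is processed in math mode and the second one in text mode, so the formula in the condition must be enclosed in dollars.

\begtt
$$ |x| = \cases{ x  & if $x\ge0$,\cr
                 -x & otherwise.\cr} $$
\endtt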
* \i matrix \z`\matrix{<a>&<b>...&<e>\cr...\cr<u>&<v>...&<z>}` creates a matrix of given values in math mode (without brackets around it). `\pmatrix{<data>}` does the same but with (). * \i displaylines `$$\displaylines{<formula 1>\cr...\cr<formula n>}$$` prints multiple (centered) formulae in display mode. * \i eqalign `$$\eqalign{<form.1 left>&<form.1 right>\cr...\cr<form.n left>&<form.n right>}$$`\hfil\break prints multiple formulae aligned by `&` character in display mode. * \x`\eqalignno` behaves like `\eqalign` but a second `&` followed by a `<mark>` can be in some lines. These lines place the `<mark>` in the right margin. \x`\leqalignno` does the same as `\eqalignno` but `<mark>` is put to the left margin. } * \i mathpalette `\mathpalette\macro{<text>}` runs `\macro<style primitive>{<text>}`. Your `\macro` can re-set current math style using its `#1` parameter. Example: `\def\macro#1#2{\hbox{$#1#2$}}`. \enditems \raggedbottom \vfil\break \nonum\sec[index] Index \iis LaTeX/macros {\LaTeX{} macros} \iis plain~TeX/macros {plain \TeX{} macros} \iis OpTeX {\OpTeX} \iis pdfTeX {pdf\TeX} \iis luaTeX {Lua\TeX} \iis XeTeX {\XeTeX} \iis TeX/engines {\TeX{} engines} \iis -percent {{\code{\\\%}}} \iis -at {{\code{\\\&}}} \iis -dollar {{\code{\\\$}}} \iis -hash {{\code{\\\#}}} \iis -space {{\code{\\}{\tt\char9251}}} \iis -italiccorr {{\code{\\/}}} \iis -hyphen {{\code{\\-}}} \iis TeXlive {\TeX{}live} \iis baselineskiprule {\code{\\baselineskip} rule} \iis -comma {{\code{\\,}}} \iis -greater {{\code{\\>}}} \iis -column {{\code{\\;}}} \iis -exclam {{\code{\\!}}} \iis plain~TeX {plain \TeX} {\let\Blue=\relax \typosize[9/11] \preprocessindex \begmulti 3 \makeindex \endmulti } \normalbottom \vfill \noindent Petr Olšák {\tt petr@olsak.net}\nl Czech Technical University in Prague\nl Version of the text: 0.10 (\the\year-\thed\month-\thed\day) \break \bye