groff_char

Section: Environments, Tables, and Troff Macros (7)
Updated: 2 July 2023
 

Name

groff_char - GNU roff special character and glyph repertoire

Description

The GNU roff typesetting system has a large glyph repertoire suitable for production of varied literary, professional, technical, and mathematical documents. groff works with characters; an output device renders glyphs. groff's input character set is restricted to that defined by the standards ISO Latin-1 (ISO 8859-1) and CCSID “code page” 1047 (an EBCDIC arrangement of Latin-1). For ease of document maintenance in UTF-8 environments, it is advisable to use only the Unicode basic Latin code points, a subset of all of the foregoing historically referred to as US-ASCII, which has only 94 visible, printable code points. In groff, these are termed ordinary characters. Often, many more are desired in output.

AT&T troff in the 1970s faced a similar problem: the available typesetter's glyph repertoire differed from that of the computers that controlled it. troff's solution was a form of escape sequence known as a special character to access several dozen additional glyphs available in the fonts prepared for mounting in the phototypesetter. These glyphs were mapped onto a two-character name space for a degree of mnemonic convenience; for example, the escape sequence \(aa encoded an acute accent and \(sc a section sign.

groff has lifted historical roff limitations on special character name lengths, but recognizes and retains compatibility with the historical names. groff expands the lexicon of glyphs available by name and permits users to define their own special character escape sequences with the char request.

Special character names are groff identifiers (see the section “Identifiers” in the groff language documentation). Our discussion uses the terms “glyph name” and “special character name” interchangeably; we assume no character translations or redefinitions.

This document lists all of the glyph names predefined by groff's font description files and presents the systematic notation by which it enables access to arbitrary Unicode code points and construction of composite glyphs.

Glyphs listed may be unavailable, or may vary in appearance, depending on the output device and font chosen when the page was formatted. This page was rendered for device \*[.T] using font \n[.fn].

A few escape sequences that are not groff special characters also produce glyphs; these exist for syntactical or historical reasons. \', \`, \-, and \_ are translated on input to the special character escape sequences \[aa], \[ga], \[-], and \[ul], respectively. Others include \\, \. (backslash-dot), and \e.

A small number of special characters represent glyphs that are not encoded in Unicode; examples include the baseline rule \[ru] and the Bell System logo \[bs].

In groff, you can test output device support for any character (ordinary or special) with the conditional expression operator “c”.
    .ie c \[bs] \{Welcome to the \[bs] Bell System; did you get the
    Wehrmacht helmet or the Death Star?\}
    .el No Bell System logo.
For brevity in the remainder of this document, we shall refer to systems conforming to the ISO 646:1991 IRV, ISO 8859, or ISO 10646 (“Unicode”) character encoding standards as “ISO” systems, and those employing IBM code page 1047 as “EBCDIC” systems. That said, EBCDIC systems that support groff are known to also support UTF-8.

While groff accepts eight-bit encoded input, not all such code points are valid as input. On ISO platforms, character codes 0, 11, 13–31, and 128–159 are invalid. (This is all C0 and C1 controls except for SOH through LF [Control+A to Control+J], and FF [Control+L].) On EBCDIC platforms, 0, 8–9, 11, 13–20, 23–31, and 48–63 are invalid. Some of these code points are used by groff for internal purposes, which is one reason it does not support UTF-8 natively.
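The ISO-platform rule above lends itself to a quick pre-flight check of an input file. The following is a minimal sketch, not part of groff; the function names `is_valid_iso_input_code` and `find_invalid_input` are hypothetical, and the invalid ranges are copied from the paragraph above.

```python
def is_valid_iso_input_code(code: int) -> bool:
    """Return True if an eight-bit code is accepted as groff input on an ISO platform.

    Invalid codes, as listed in the text: 0, 11, 13-31, and 128-159.
    """
    if not 0 <= code <= 255:
        raise ValueError("expected an eight-bit code (0-255)")
    invalid = code == 0 or code == 11 or 13 <= code <= 31 or 128 <= code <= 159
    return not invalid


def find_invalid_input(data: bytes) -> list[tuple[int, int]]:
    """List (offset, code) pairs for every byte the rule above rejects."""
    return [(i, b) for i, b in enumerate(data) if not is_valid_iso_input_code(b)]
```

Run against a file's raw bytes, `find_invalid_input` would flag, for instance, carriage returns (code 13) left over from DOS-style line endings.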

Fundamental character set

The ordinary characters catalogued above, plus the space, tab, newline, and leader (Control+A), form the fundamental character set for groff input; anything in the language, even over one million code points in Unicode, can be expressed using it. On ISO systems, code points in the range 33–126 comprise a common set of printable glyphs in all of the aforementioned ISO character encoding standards. It is this character set and (with some noteworthy exceptions) the corresponding glyph repertoire for which AT&T troff was implemented. On EBCDIC systems, printable characters are in the range 66–201 and 203–254; those without counterparts in the ISO range 33–126 are discussed in the next subsection. All of the following characters map to glyphs as you would expect.
! # $ % & ( ) * + , . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ ] _
a b c d e f g h i j k l m n o p q r s t u v w x y z { | }
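To see whether a document already stays within the fundamental character set on an ISO system, one can scan it for anything else. This is a minimal sketch under the definition given above (code points 33–126 plus space, tab, newline, and the leader, Control+A); the names `FUNDAMENTAL` and `non_fundamental` are hypothetical.

```python
# Code points of the fundamental character set on ISO systems, per the text:
# the ordinary characters 33-126 plus space, tab, newline, and leader (Control+A).
FUNDAMENTAL = {ord(" "), ord("\t"), ord("\n"), 0x01} | set(range(33, 127))


def non_fundamental(text: str) -> list[str]:
    """Return the distinct characters outside the fundamental set, in order of first use."""
    seen: list[str] = []
    for ch in text:
        if ord(ch) not in FUNDAMENTAL and ch not in seen:
            seen.append(ch)
    return seen
```

Every character this reports is a candidate for replacement by a special character escape sequence.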
The remaining ordinary characters surprise computing professionals and others intimately familiar with the ISO character encodings. The developers of AT&T troff chose mappings for them that would be useful for typesetting technical literature in a broad range of scientific disciplines: Bell Labs used the system for preparation of AT&T's patent filings with the U.S. government. Further, the prevailing character encoding standard in the 1970s, USAS X3.4-1968 (“ASCII”), deliberately supported semantic ambiguity at some code points, and outright substitution at several others, to suit the localization demands of various national standards bodies.

The table below presents the seven exceptional code points with their typical keycap engravings, their glyph mappings and semantics in roff systems, and the escape sequences producing the Unicode basic Latin character they replace. The first, the neutral double quote, is a partial exception because it does represent itself, but since the roff language also uses it to quote macro arguments, groff supports a special character escape sequence as an alternative form so that the glyph can be easily included in macro arguments without requiring the user to master the quoting rules that AT&T troff required in that context. (Some requests, like ds, also treat " non-literally.) Furthermore, not all of the special character escape sequences are portable to AT&T troff and all of its descendants; these groff extensions are presented using its special character form \[], whereas portable special character escape sequences are shown in the traditional \( form. \- and \e are portable to all known troffs. \e means “the glyph of the current escape character”; it therefore can produce unexpected output if the ec request is used.

On devices with a limited glyph repertoire, glyphs in the “keycap” and “appearance” columns on the same row of the table may look identical; except for the neutral double quote, this will not be the case on more-capable devices. Review your document using as many different output devices as possible.
Keycap  Appearance and meaning   Special character and meaning
"       " neutral double quote   \[dq] neutral double quote
'       ’ closing single quote   \[aq] neutral apostrophe
-       ‐ hyphen                 \[-] or \- minus sign/Unix dash
\       (escape character)       \e or \[rs] reverse solidus
^       ˆ modifier circumflex    \(ha circumflex/caret/“hat”
`       ‘ opening single quote   \(ga grave accent
~       ˜ modifier tilde         \(ti tilde
The hyphen-minus is a particularly unfortunate case of overloading. Its awkward name in ISO 8859 and later standards reflects the many distinguishable purposes to which it had already been put by the 1980s, including a hyphen, a minus sign, and (alone or in repetition) dashes of varying widths. For best results in roff systems, use the “-” character in input outside an escape sequence only to mean a hyphen, as in the phrase “long-term”. For a minus sign in running text or a Unix command-line option dash, use \- (or \[-] in groff if you find it helps the clarity of the source document). (Another minus sign, for use in mathematical equations, is available as \[mi].) AT&T troff supported em-dashes as \(em, as does groff.

The special character escape sequence for the apostrophe as a neutral single quote is typically needed only in technical content; typing words like “can't” and “Anne's” in a natural way will render correctly, because in ordinary prose an apostrophe is typeset either as a closing single quotation mark or as a neutral single quote, depending on the capabilities of the output device. By contrast, special character escape sequences should be used for quotation marks unless portability to limited or historical troff implementations is necessary; on those systems, the input convention is to pair the grave accent with the apostrophe for single quotes, and to double both characters for double quotes. AT&T troff defined no special characters for quotation marks or the apostrophe. Repeated single quotes (‘‘thus’’) will be visually distinguishable from double quotes (“thus”) on terminal devices, and perhaps on others (depending on the font selected).
AT&T troff input             recommended groff input
A Winter's Tale              A Winter's Tale
`U.K. outer quotes'          \[oq]U.K. outer quotes\[cq]
`U.K. ``inner'' quotes'      \[oq]U.K. \[lq]inner\[rq] quotes\[cq]
``U.S. outer quotes''        \[lq]U.S. outer quotes\[rq]
``U.S. `inner' quotes''      \[lq]U.S. \[oq]inner\[cq] quotes\[rq]
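The rows of the table above can be produced mechanically when migrating old input. The following is a rough sketch only: the function name `modernize_quotes` is hypothetical, and it assumes that a quoted passage contains no apostrophes of its own (a passage like `don't' would be split at the wrong place), so its output should be reviewed by hand.

```python
import re


def modernize_quotes(text: str) -> str:
    """Rewrite AT&T-style `single' and ``double'' quotes as groff special characters."""
    # Doubled characters first, so that they are not mistaken for single quotes.
    text = text.replace("``", r"\[lq]").replace("''", r"\[rq]")
    # A grave accent followed later by an apostrophe is taken as a quoted passage.
    # A lone apostrophe (as in "Winter's") has no opening grave and is left alone.
    return re.sub(
        r"`([^`']*)'",
        lambda m: r"\[oq]" + m.group(1) + r"\[cq]",
        text,
    )
```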


If you frequently require quotation marks in your document, see if the macro package you're using supplies strings or macros to facilitate quotation, or define them yourself (except in man pages).

Using Unicode basic Latin characters to compose boxes and lines is ill-advised. roff systems have special characters for drawing horizontal and vertical lines; see subsection “Rules and lines” below. Preprocessors that draw boxes will produce the best possible output for the device, falling back to basic Latin glyphs only when necessary.

Eight-bit encodings and Latin-1 supplement

ISO 646 is a seven-bit code encoding 128 code points; eight-bit codes are twice the size. ISO 8859-1 and code page 1047 allocated the additional space to what Unicode calls “C1 controls” (control characters) and the “Latin-1 supplement”. The C1 controls are neither printable nor usable as groff input.

Two Latin-1 supplement characters are handled specially on input. troff never produces them as output.
NBSP
encodes a no-break space; it is mapped to \~, the adjustable non-breaking space escape sequence.
SHY
encodes a soft hyphen; it is mapped to \%, the hyphenation control escape sequence.

The remaining characters in the Latin-1 supplement represent themselves. Although they can be specified directly with the keyboard on systems configured to use Latin-1 as the character encoding, it is more portable, both to other roff systems and to UTF-8 environments, to use their special character escape sequences, shown below. The glyph descriptions we use are non-standard in some cases, for brevity.
¡  \[r!]  inverted exclamation mark     Ñ  \[~N]  N tilde
¢  \[ct]  cent sign                     Ò  \[`O]  O grave
£  \[Po]  pound sign                    Ó  \['O]  O acute
¤  \[Cs]  currency sign                 Ô  \[^O]  O circumflex
¥  \[Ye]  yen sign                      Õ  \[~O]  O tilde
¦  \[bb]  broken bar                    Ö  \[:O]  O dieresis
§  \[sc]  section sign                  ×  \[mu]  multiplication sign
¨  \[ad]  dieresis accent               Ø  \[/O]  O slash
©  \[co]  copyright sign                Ù  \[`U]  U grave
ª  \[Of]  feminine ordinal indicator    Ú  \['U]  U acute
«  \[Fo]  left double chevron           Û  \[^U]  U circumflex
¬  \[no]  logical not                   Ü  \[:U]  U dieresis
®  \[rg]  registered sign               Ý  \['Y]  Y acute
¯  \[a-]  macron accent                 Þ  \[TP]  uppercase thorn
°  \[de]  degree sign                   ß  \[ss]  lowercase sharp s
±  \[+-]  plus-minus                    à  \[`a]  a grave
²  \[S2]  superscript two               á  \['a]  a acute
³  \[S3]  superscript three             â  \[^a]  a circumflex
´  \[aa]  acute accent                  ã  \[~a]  a tilde
µ  \[mc]  micro sign                    ä  \[:a]  a dieresis
¶  \[ps]  pilcrow sign                  å  \[oa]  a ring
·  \[pc]  centered period               æ  \[ae]  ae ligature
¸  \[ac]  cedilla accent                ç  \[,c]  c cedilla
¹  \[S1]  superscript one               è  \[`e]  e grave
º  \[Om]  masculine ordinal indicator   é  \['e]  e acute
»  \[Fc]  right double chevron          ê  \[^e]  e circumflex
¼  \[14]  one quarter symbol            ë  \[:e]  e dieresis
½  \[12]  one half symbol               ì  \[`i]  i grave
¾  \[34]  three quarters symbol         í  \['i]  i acute
¿  \[r?]  inverted question mark        î  \[^i]  i circumflex
À  \[`A]  A grave                       ï  \[:i]  i dieresis
Á  \['A]  A acute                       ð  \[Sd]  lowercase eth
Â  \[^A]  A circumflex                  ñ  \[~n]  n tilde
Ã  \[~A]  A tilde                       ò  \[`o]  o grave
Ä  \[:A]  A dieresis                    ó  \['o]  o acute
Å  \[oA]  A ring                        ô  \[^o]  o circumflex
Æ  \[AE]  AE ligature                   õ  \[~o]  o tilde
Ç  \[,C]  C cedilla                     ö  \[:o]  o dieresis
È  \[`E]  E grave                       ÷  \[di]  division sign
É  \['E]  E acute                       ø  \[/o]  o slash
Ê  \[^E]  E circumflex                  ù  \[`u]  u grave
Ë  \[:E]  E dieresis                    ú  \['u]  u acute
Ì  \[`I]  I grave                       û  \[^u]  u circumflex
Í  \['I]  I acute                       ü  \[:u]  u dieresis
Î  \[^I]  I circumflex                  ý  \['y]  y acute
Ï  \[:I]  I dieresis                    þ  \[Tp]  lowercase thorn
Ð  \[-D]  uppercase eth                 ÿ  \[:y]  y dieresis
 

Special character escape forms

Glyphs that lack a character code in the basic Latin repertoire to directly represent them are entered by one of several special character escape forms. Such glyphs can be simple or composite, and accessed either by name or numerically by code point. Code points and combining properties are determined by character encoding standards, whereas glyph names as used here originated in AT&T troff special character escape sequences. Predefined glyph names use only characters in the basic Latin repertoire.
\(gl
is a special character escape sequence for the glyph with the two-character name gl. This is the original syntax form supported by AT&T troff. The acute accent, \(aa, is an example.
\C'glyph-name'
is a special character escape sequence for glyph-name, which can be of arbitrary length. The delimiter, shown here as a neutral apostrophe, can be any character not occurring in glyph-name. This syntax form was introduced in later versions of AT&T device-independent troff. The foregoing acute accent example can be expressed as \C'aa'.
\[glyph-name]
is a special character escape sequence for glyph-name, which can be of arbitrary length but must not contain a closing square bracket “]”. (No glyph names predefined by groff employ “]”.) The foregoing acute accent example can be expressed in groff as \[aa].

\C'c' and \[c] are not synonyms for the ordinary character “c”, but request the special character named “\c”. For example, “\[a]” is not “a”, but rather a special character with the internal glyph name (used in font description files and diagnostic messages) \a, which is typically undefined. The only such glyph name groff predefines is the minus sign, which can therefore be accessed as \C'-' or \[-].

\[base-char composite-1 composite-2 ... composite-n]
is a composite glyph. Glyphs like a lowercase “e” with an acute accent, as in the word “café”, can be expressed as \[e aa]. See subsection “Accents” below for a table of combining glyph names.

Unicode encodes far more characters than groff has glyph names for; special character escape forms based on numerical code points enable access to any of them. Frequently used glyphs or glyph combinations can be stored in strings, and new glyph names can be created ad hoc with the char request.

\[unnnn[n[n]]]
is a Unicode numeric special character escape sequence. Any Unicode code point can be accessed with four to six hexadecimal digits, with hexadecimal letters accepted in uppercase form only. Thus, \[u02DA] accesses the (spacing) ring accent, producing “˚”.

Unicode code points can be composed as well; when they are, GNU troff requires NFD (Normalization Form D), where all Unicode glyphs are maximally decomposed. (Exception: precomposed characters in the Latin-1 supplement described above are also accepted. Do not count on this exception remaining in a future GNU troff that accepts UTF-8 input directly.) Thus, GNU troff accepts “caf\['e]”, “caf\[e aa]”, and “caf\[u0065_0301]” as ways to input “café”. (Due to its legacy 8-bit encoding compatibility, at present it also accepts “caf\[u00E9]” on ISO Latin-1 systems.)

\[ubase-char[_combining-component]...]
constructs a composite glyph from Unicode numeric special character escape sequences. The code points of the base glyph and the combining components are each expressed in hexadecimal, with an underscore (_) separating each component. Thus, \[u006E_0303] produces “ñ”.
\[charnnn]
expresses an eight-bit code point where nnn is the code point of the character, a decimal number between 0 and 255 without leading zeroes. This legacy numeric special character escape sequence is used to map characters onto glyphs via the trin request in macro files.
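The Unicode numeric forms described above can be generated from existing UTF-8 text. The sketch below is an illustration, not a groff tool: the function name `groff_unicode_escape` is hypothetical, and it assumes its argument is one base character with any combining marks attached. It applies NFD, then writes each code point with at least four uppercase hexadecimal digits, joined by underscores.

```python
import unicodedata


def groff_unicode_escape(char: str) -> str:
    """Build a groff Unicode numeric escape for one (possibly accented) character."""
    # GNU troff wants maximally decomposed input (Normalization Form D).
    decomposed = unicodedata.normalize("NFD", char)
    # Four to six hexadecimal digits per code point, letters in uppercase.
    parts = [format(ord(cp), "04X") for cp in decomposed]
    return r"\[u" + "_".join(parts) + "]"
```

For example, `groff_unicode_escape("\u00e9")` yields the `\[u0065_0301]` form used for “café” above, rather than the precomposed `\[u00E9]` that is accepted only as a Latin-1 exception.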
 

Glyph tables

In this section, groff's glyph name repertoire is presented in tabular form. The meanings of the columns are as follows.
Output
shows the glyph as it appears on the device used to render this document; although it can have a notably different shape on other devices (and is subject to user-directed translation and replacement), groff attempts reasonable equivalency on all output devices.
Input
shows the groff character (ordinary or special) that normally produces the glyph. Some code points have multiple glyph names.
Unicode
is the code point notation for the glyph or combining glyph sequence as described in subsection “Special character escape forms” above. It corresponds to the standard notation for Unicode short identifiers such that groff's unnnn is equivalent to Unicode's U+nnnn.
Notes
describes the glyph, elucidating the mnemonic value of the glyph name where possible.

A plus sign “+” indicates that the glyph name appears in the AT&T troff user's manual, CSTR #54 (1992 revision). When using the AT&T special character syntax \(xx, widespread portability can be expected from such names.

Entries marked with “***” denote glyphs used for mathematical purposes. On typesetting devices, such glyphs are typically drawn from a special font. Often, such glyphs lack bold or italic style forms or have metrics that look incongruous in ordinary prose. A few which are not uncommon in running text have “text variants”, which should work better in that context. Conversely, a handful of glyphs that are normally drawn from a text font may be required in mathematical equations. Both sets of exceptions are noted in the tables where they appear (“Logical symbols” and “Mathematical symbols”).
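The notation in the “Unicode” column can be decoded with a few lines of code, which is handy when cross-checking a table entry against a Unicode chart. This is a small sketch with hypothetical function names; it assumes well-formed entries such as u2014 or u0065_0301 and does not handle entries that have no code point.

```python
def glyph_code_to_short_ids(code: str) -> list[str]:
    """Map groff's uXXXX[_XXXX...] notation to Unicode short identifiers (U+XXXX)."""
    if not code.startswith("u"):
        raise ValueError("expected groff's uXXXX[_XXXX...] notation")
    return ["U+" + part for part in code[1:].split("_")]


def glyph_code_to_text(code: str) -> str:
    """Return the character sequence that a 'Unicode' column entry names."""
    if not code.startswith("u"):
        raise ValueError("expected groff's uXXXX[_XXXX...] notation")
    return "".join(chr(int(part, 16)) for part in code[1:].split("_"))
```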
 

Basic Latin

Apart from basic Latin characters with special mappings, described in subsection “Fundamental character set” above, a few others in that range have special character glyph names. These were defined for ease of input on non-U.S. keyboards lacking keycaps for them, or for symmetry with other special character glyph names serving a similar purpose. The vertical bar is overloaded; the \[ba] and \[or] escape sequences may render differently. See subsection “Mathematical symbols” below for special variants of the plus, minus, and equals signs normally drawn from this range.
Output  Input  Unicode  Notes
"       \[dq]  u0022    neutral double quote
#       \[sh]  u0023    number sign
$       \[Do]  u0024    dollar sign
'       \[aq]  u0027    apostrophe, neutral single quote
/       \[sl]  u002F    slash, solidus +
@       \[at]  u0040    at sign
[       \[lB]  u005B    left square bracket
\       \[rs]  u005C    reverse solidus
]       \[rB]  u005D    right square bracket
^       \[ha]  u005E    circumflex, caret, “hat”
{       \[lC]  u007B    left brace
|       |      u007C    bar
|       \[ba]  u007C    bar
|       \[or]  u007C    bitwise or +
}       \[rC]  u007D    right brace
~       \[ti]  u007E    tilde

 

Supplementary Latin letters

Historically, \[ss] could be considered a ligature of “sz”. An uppercase form is available as \[u1E9E], but in the German language it is of specialized use; ß does not normally uppercase-transform to it, but rather to “SS”. “Lowercase f with hook” is also used as a function symbol; see subsection “Mathematical symbols” below.
Output  Input  Unicode  Notes
Ð       \[-D]  u00D0    uppercase eth
ð       \[Sd]  u00F0    lowercase eth
Þ       \[TP]  u00DE    uppercase thorn
þ       \[Tp]  u00FE    lowercase thorn
ß       \[ss]  u00DF    lowercase sharp s
ı       \[.i]  u0131    i without tittle
ȷ       \[.j]  u0237    j without tittle
ƒ       \[Fn]  u0192    lowercase f with hook, function
Ł       \[/L]  u0141    L with stroke
ł       \[/l]  u0142    l with stroke
Ø       \[/O]  u00D8    O with stroke
ø       \[/o]  u00F8    o with stroke


 

Ligatures and digraphs

Output  Input  Unicode          Notes
ff      \[ff]  u0066_0066       ff ligature +
fi      \[fi]  u0066_0069       fi ligature +
fl      \[fl]  u0066_006C       fl ligature +
ffi     \[Fi]  u0066_0066_0069  ffi ligature +
ffl     \[Fl]  u0066_0066_006C  ffl ligature +
Æ       \[AE]  u00C6            AE ligature
æ       \[ae]  u00E6            ae ligature
Œ       \[OE]  u0152            OE ligature
œ       \[oe]  u0153            oe ligature
Ĳ       \[IJ]  u0132            IJ digraph
ĳ       \[ij]  u0133            ij digraph

 

Accents

Normally, the formatting of a special character advances the drawing position as an ordinary character does. groff's composite request designates a special character as combining. The composite.tmac macro file, loaded automatically by the default troffrc, maps the following special characters to the combining characters shown below. The non-combining code point in parentheses is used when the special character occurs in isolation (compare “caf\[e aa]” and “caf\[aa]e”).
Output  Input  Unicode         Notes
˝       \[a"]  u030B (u02DD)   double acute accent
¯       \[a-]  u0304 (u00AF)   macron accent
˙       \[a.]  u0307 (u02D9)   dot accent
^       \[a^]  u0302 (u005E)   circumflex accent
´       \[aa]  u0301 (u00B4)   acute accent +
`       \[ga]  u0300 (u0060)   grave accent +
˘       \[ab]  u0306 (u02D8)   breve accent
¸       \[ac]  u0327 (u00B8)   cedilla accent
¨       \[ad]  u0308 (u00A8)   dieresis accent
ˇ       \[ah]  u030C (u02C7)   caron accent
˚       \[ao]  u030A (u02DA)   ring accent
~       \[a~]  u0303 (u007E)   tilde accent
˛       \[ho]  u0328 (u02DB)   hook accent
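The relationship between a composite such as \[e aa] and its Unicode notation can be illustrated with a lookup over the table above. This is a sketch for illustration only: the names `COMBINING` and `composite_code` are hypothetical, the code points are copied from the table, and the real mapping lives in groff's composite.tmac rather than in any such dictionary.

```python
# Combining code points for the accent glyph names, copied from the table above.
COMBINING = {
    'a"': 0x030B, "a-": 0x0304, "a.": 0x0307, "a^": 0x0302, "aa": 0x0301,
    "ga": 0x0300, "ab": 0x0306, "ac": 0x0327, "ad": 0x0308, "ah": 0x030C,
    "ao": 0x030A, "a~": 0x0303, "ho": 0x0328,
}


def composite_code(base: str, *accents: str) -> str:
    """Return the uXXXX_YYYY notation for a base character plus accent glyph names."""
    points = [ord(base)] + [COMBINING[name] for name in accents]
    return "u" + "_".join(format(p, "04X") for p in points)
```

Here `composite_code("e", "aa")` gives `u0065_0301`, the decomposed sequence that the composite \[e aa] stands for.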

 

Accented characters

All of these glyphs can be composed using combining glyph names as described in subsection “Special character escape forms” above; the names below are short aliases for convenience.
Output  Input  Unicode     Notes
Á       \['A]  u0041_0301  A acute
Ć       \['C]  u0043_0301  C acute
É       \['E]  u0045_0301  E acute
Í       \['I]  u0049_0301  I acute
Ó       \['O]  u004F_0301  O acute
Ú       \['U]  u0055_0301  U acute
Ý       \['Y]  u0059_0301  Y acute
á       \['a]  u0061_0301  a acute
ć       \['c]  u0063_0301  c acute
é       \['e]  u0065_0301  e acute
í       \['i]  u0069_0301  i acute
ó       \['o]  u006F_0301  o acute
ú       \['u]  u0075_0301  u acute
ý       \['y]  u0079_0301  y acute

Ä       \[:A]  u0041_0308  A dieresis
Ë       \[:E]  u0045_0308  E dieresis
Ï       \[:I]  u0049_0308  I dieresis
Ö       \[:O]  u004F_0308  O dieresis
Ü       \[:U]  u0055_0308  U dieresis
Ÿ       \[:Y]  u0059_0308  Y dieresis
ä       \[:a]  u0061_0308  a dieresis
ë       \[:e]  u0065_0308  e dieresis
ï       \[:i]  u0069_0308  i dieresis
ö       \[:o]  u006F_0308  o dieresis
ü       \[:u]  u0075_0308  u dieresis
ÿ       \[:y]  u0079_0308  y dieresis

Â       \[^A]  u0041_0302  A circumflex
Ê       \[^E]  u0045_0302  E circumflex
Î       \[^I]  u0049_0302  I circumflex
Ô       \[^O]  u004F_0302  O circumflex
Û       \[^U]  u0055_0302  U circumflex
â       \[^a]  u0061_0302  a circumflex
ê       \[^e]  u0065_0302  e circumflex
î       \[^i]  u0069_0302  i circumflex
ô       \[^o]  u006F_0302  o circumflex
û       \[^u]  u0075_0302  u circumflex

À       \[`A]  u0041_0300  A grave
È       \[`E]  u0045_0300  E grave
Ì       \[`I]  u0049_0300  I grave
Ò       \[`O]  u004F_0300  O grave
Ù       \[`U]  u0055_0300  U grave
à       \[`a]  u0061_0300  a grave
è       \[`e]  u0065_0300  e grave
ì       \[`i]  u0069_0300  i grave
ò       \[`o]  u006F_0300  o grave
ù       \[`u]  u0075_0300  u grave

Ã       \[~A]  u0041_0303  A tilde
Ñ       \[~N]  u004E_0303  N tilde
Õ       \[~O]  u004F_0303  O tilde
ã       \[~a]  u0061_0303  a tilde
ñ       \[~n]  u006E_0303  n tilde
õ       \[~o]  u006F_0303  o tilde

Š       \[vS]  u0053_030C  S caron
š       \[vs]  u0073_030C  s caron
Ž       \[vZ]  u005A_030C  Z caron
ž       \[vz]  u007A_030C  z caron

Ç       \[,C]  u0043_0327  C cedilla
ç       \[,c]  u0063_0327  c cedilla

Å       \[oA]  u0041_030A  A ring
å       \[oa]  u0061_030A  a ring

 

Quotation marks

The neutral double quote, often useful when documenting programming languages, is also available as a special character for convenient embedding in macro arguments; see subsection “Fundamental character set” above.
Output  Input  Unicode  Notes
„       \[Bq]  u201E    low double comma quote
‚       \[bq]  u201A    low single comma quote
“       \[lq]  u201C    left double quote
”       \[rq]  u201D    right double quote
‘       \[oq]  u2018    single opening (left) quote
’       \[cq]  u2019    single closing (right) quote
'       \[aq]  u0027    apostrophe, neutral single quote
"       "      u0022    neutral double quote
"       \[dq]  u0022    neutral double quote
«       \[Fo]  u00AB    left double chevron
»       \[Fc]  u00BB    right double chevron
‹       \[fo]  u2039    left single chevron
›       \[fc]  u203A    right single chevron

 

Punctuation

The Unicode name for U+00B7 is “middle dot”, which is unfortunately confusable with the groff mnemonic for the visually similar but semantically distinct multiplication dot; see subsection “Mathematical symbols” below.
Output  Input  Unicode  Notes
¡       \[r!]  u00A1    inverted exclamation mark
¿       \[r?]  u00BF    inverted question mark
·       \[pc]  u00B7    centered period
—       \[em]  u2014    em-dash +
–       \[en]  u2013    en-dash
‐       \[hy]  u2010    hyphen +

 

Brackets

On typesetting devices, the bracket extensions are font-invariant glyphs; that is, they are rendered the same way regardless of font (with a drawing escape sequence). On terminals, they are not font-invariant; groff maps them rather arbitrarily to U+23AA (“curly bracket extension”). In AT&T troff, only one glyph was available to vertically extend brackets, braces, and parentheses: \(bv.

Not all devices supply bracket pieces that can be piled up with \b due to the restrictions of the escape's piling algorithm. A general solution to build brackets out of pieces is the following macro:

    .\" Make a pile centered vertically 0.5em above the baseline.
    .\" The first argument is placed at the top.
    .\" The pile is returned in string 'pile'.
    .eo
    .de pile-make
    .  nr pile-wd 0
    .  nr pile-ht 0
    .  ds pile-args
    .
    .  nr pile-# \n[.$]
    .  while \n[pile-#] \{\
    .    nr pile-wd (\n[pile-wd] >? \w'\$[\n[pile-#]]')
    .    nr pile-ht +(\n[rst] - \n[rsb])
    .    as pile-args \v'\n[rsb]u'\"
    .    as pile-args \Z'\$[\n[pile-#]]'\"
    .    as pile-args \v'-\n[rst]u'\"
    .    nr pile-# -1
    .  \}
    .
    .  ds pile \v'(-0.5m + (\n[pile-ht]u / 2u))'\"
    .  as pile \*[pile-args]\"
    .  as pile \v'((\n[pile-ht]u / 2u) + 0.5m)'\"
    .  as pile \h'\n[pile-wd]u'\"
    ..
    .ec

Another complication is the fact that some glyphs which represent bracket pieces in AT&T troff can be used for other mathematical symbols as well, for example \(lf and \(rf, which provide the floor operator. Some output devices, such as dvi, don't unify such glyphs. For this reason, the glyphs \[lf], \[rf], \[lc], and \[rc] are not unified with similar-looking bracket pieces. In groff, only glyphs with long names are guaranteed to pile up correctly for all devices, provided those glyphs are available.
Output  Input              Unicode  Notes
[       [                  u005B    left square bracket
[       \[lB]              u005B    left square bracket
]       ]                  u005D    right square bracket
]       \[rB]              u005D    right square bracket
{       {                  u007B    left brace
{       \[lC]              u007B    left brace
}       }                  u007D    right brace
}       \[rC]              u007D    right brace
⟨       \[la]              u27E8    left angle bracket
⟩       \[ra]              u27E9    right angle bracket
⎪       \[bv]              u23AA    brace vertical extension + ***
⎪       \[braceex]         u23AA    brace vertical extension

⎡       \[bracketlefttp]   u23A1    left square bracket top
⎢       \[bracketleftex]   u23A2    left square bracket extension
⎣       \[bracketleftbt]   u23A3    left square bracket bottom

⎤       \[bracketrighttp]  u23A4    right square bracket top
⎥       \[bracketrightex]  u23A5    right square bracket extension
⎦       \[bracketrightbt]  u23A6    right square bracket bottom

⎧       \[lt]              u23A7    left brace top +
⎨       \[lk]              u23A8    left brace middle +
⎩       \[lb]              u23A9    left brace bottom +
⎧       \[bracelefttp]     u23A7    left brace top
⎨       \[braceleftmid]    u23A8    left brace middle
⎩       \[braceleftbt]     u23A9    left brace bottom
⎪       \[braceleftex]     u23AA    left brace extension

⎫       \[rt]              u23AB    right brace top +
⎬       \[rk]              u23AC    right brace middle +
⎭       \[rb]              u23AD    right brace bottom +
⎫       \[bracerighttp]    u23AB    right brace top
⎬       \[bracerightmid]   u23AC    right brace middle
⎭       \[bracerightbt]    u23AD    right brace bottom
⎪       \[bracerightex]    u23AA    right brace extension

⎛       \[parenlefttp]     u239B    left parenthesis top
⎜       \[parenleftex]     u239C    left parenthesis extension
⎝       \[parenleftbt]     u239D    left parenthesis bottom
⎞       \[parenrighttp]    u239E    right parenthesis top
⎟       \[parenrightex]    u239F    right parenthesis extension
⎠       \[parenrightbt]    u23A0    right parenthesis bottom


 

Arrows

Output  Input  Unicode  Notes
←       \[<-]  u2190    horizontal arrow left +
→       \[->]  u2192    horizontal arrow right +
↔       \[<>]  u2194    bidirectional horizontal arrow
↓       \[da]  u2193    vertical arrow down +
↑       \[ua]  u2191    vertical arrow up +
↕       \[va]  u2195    bidirectional vertical arrow
⇐       \[lA]  u21D0    horizontal double arrow left
⇒       \[rA]  u21D2    horizontal double arrow right
⇔       \[hA]  u21D4    bidirectional horizontal double arrow
⇓       \[dA]  u21D3    vertical double arrow down
⇑       \[uA]  u21D1    vertical double arrow up
⇕       \[vA]  u21D5    bidirectional vertical double arrow
⎯       \[an]  u23AF    horizontal arrow extension

 

Rules and lines

On typesetting devices, the font-invariant glyphs (see subsection “Brackets” above) \[br], \[ul], and \[rn] form corners when adjacent; they can be used to build boxes. On terminal devices, they are mapped as shown in the table. The Unicode-derived names of these three glyphs are approximations.

The input character _ always accesses the underscore glyph in a font; \[ul], by contrast, may be font-invariant on typesetting devices.

The baseline rule \[ru] is a font-invariant glyph, namely a rule of one-half em.

In AT&T troff, \[rn] also served as a one en extension of the square root symbol. groff favors \[radicalex] for this purpose; see subsection “Mathematical symbols” below.
Output  Input  Unicode  Notes
|       |      u007C    bar
|       \[ba]  u007C    bar
│       \[br]  u2502    box rule +
_       _      u005F    underscore, low line +
_       \[ul]  -        underrule +
‾       \[rn]  u203E    overline +
_       \[ru]  -        baseline rule +
¦       \[bb]  u00A6    broken bar
/       /      u002F    slash, solidus +
/       \[sl]  u002F    slash, solidus +
\       \[rs]  u005C    reverse solidus


 

Text markers

Output  Input  Unicode  Notes
○       \[ci]  u25CB    circle +
•       \[bu]  u2022    bullet +
†       \[dg]  u2020    dagger +
‡       \[dd]  u2021    double dagger +
◊       \[lz]  u25CA    lozenge, diamond
□       \[sq]  u25A1    square +
¶       \[ps]  u00B6    pilcrow sign
§       \[sc]  u00A7    section sign +
☜       \[lh]  u261C    hand pointing left +
☞       \[rh]  u261E    hand pointing right +
@       @      u0040    at sign
@       \[at]  u0040    at sign
#       #      u0023    number sign
#       \[sh]  u0023    number sign
↵       \[CR]  u21B5    carriage return
✓       \[OK]  u2713    check mark

 

Legal symbols

The Bell System logo is not supported in groff.
OutputInputUnicodeNotes

[co][co]u00A9copyright sign +

[rg][rg]u00AEregistered sign +

[tm][tm]u2122trade mark sign

[bs][bs]-Bell System logo +


 

Currency symbols

OutputInputUnicodeNotes

[Do]$u0024dollar sign

[Do][Do]u0024dollar sign

[ct][ct]u00A2cent sign +

[eu][eu]u20ACEuro sign

[Eu][Eu]u20ACvariant Euro sign

[Ye][Ye]u00A5yen sign

[Po][Po]u00A3pound sign

[Cs][Cs]u00A4currency sign


 

Units

OutputInputUnicodeNotes

[de][de]u00B0degree sign +

[%0][%0]u2030per thousand, per mille sign

[fm][fm]u2032arc minute sign, foot mark +

[sd][sd]u2033arc second sign

[mc][mc]u00B5micro sign

[Of][Of]u00AAfeminine ordinal indicator

[Om][Om]u00BAmasculine ordinal indicator

 

Logical symbols

The variants of the not sign may differ in appearance or spacing depending on the device and font selected. Unicode does not encode a discrete "bitwise or" sign: on typesetting devices, it is drawn shorter than the bar, about the same height as a capital letter. Terminal devices unify \[ba] and \[or].
OutputInputUnicodeNotes

[AN][AN]u2227logical and

[OR][OR]u2228logical or

[no][no]u00AClogical not + ***

[tno][tno]u00ACtext variant of \[no]

[te][te]u2203there exists

[fa][fa]u2200for all

[st][st]u220Bsuch that

[3d][3d]u2234therefore

[tf][tf]u2234therefore

||u007Cbar

[or][or]u007Cbitwise or +

 

Mathematical symbols

\[Fn] also appears in subsection "Supplementary Latin letters" above. Observe the two varieties of the plus-minus, multiplication, and division signs; \[+-], \[mu], and \[di] are normally drawn from the special font, but have text font variants. Also be aware of three glyphs available in special font variants that are normally drawn from text fonts: the plus, minus, and equals signs. These variants may differ in appearance or spacing depending on the device and font selected. In AT&T troff, \(rn ("root en extender") served as the horizontal extension of the radical (square root) sign, \(sr, and was drawn at the maximum height of the typeface's bounding box; this enabled the special character to double as an overline (see subsection "Rules and lines" above). A contemporary font's radical sign might not ascend to such an extreme. In groff, you can instead use \[radicalex] to continue the radical sign \[sr]; these special characters are intended for use with text fonts. \[sqrt] and \[sqrtex] are their counterparts with mathematical spacing.
OutputInputUnicodeNotes

[12][12]u00BDone half symbol +

[14][14]u00BCone quarter symbol +

[34][34]u00BEthree quarters symbol +

[18][18]u215Bone eighth symbol

[38][38]u215Cthree eighths symbol

[58][58]u215Dfive eighths symbol

[78][78]u215Eseven eighths symbol

[S1][S1]u00B9superscript one

[S2][S2]u00B2superscript two

[S3][S3]u00B3superscript three



++u002Bplus

[pl][pl]u002Bspecial variant of plus + ***

--]u002Dminus

[mi][mi]u2212special variant of minus + ***

[-+][-+]u2213minus-plus

[+-][+-]u00B1plus-minus + ***

[t+-][t+-]u00B1text variant of \[+-]

[md][md]u22C5multiplication dot

[mu][mu]u00D7multiplication sign + ***

[tmu][tmu]u00D7text variant of \[mu]

[c*][c*]u2297circled times

[c+][c+]u2295circled plus

[di][di]u00F7division sign + ***

[tdi][tdi]u00F7text variant of \[di]

[f/][f/]u2044fraction slash

**u002Aasterisk

[**][**]u2217mathematical asterisk +



[<=][<=]u2264less than or equal to +

[>=][>=]u2265greater than or equal to +

[<<][<<]u226Amuch less than

[>>][>>]u226Bmuch greater than

==u003Dequals

[eq][eq]u003Dspecial variant of equals + ***

[!=][!=]u003D_0338not equals +

[==][==]u2261equivalent +

[ne][ne]u2261_0338not equivalent

[=~][=~]u2245approximately equal to

[|=][|=]u2243asymptotically equal to +

[ti][ti]u007Etilde +

[ap][ap]u223Csimilar to, tilde operator +

[~~][~~]u2248almost equal to

[~=][~=]u2248almost equal to

[pt][pt]u221Dproportional to +



[es][es]u2205empty set +

[mo][mo]u2208element of a set +

[nm][nm]u2208_0338not element of set

[sb][sb]u2282proper subset +

[nb][nb]u2282_0338not subset

[sp][sp]u2283proper superset +

[nc][nc]u2283_0338not superset

[ib][ib]u2286subset or equal +

[ip][ip]u2287superset or equal +

[ca][ca]u2229intersection, cap +

[cu][cu]u222Aunion, cup +



[/_][/_]u2220angle

[pp][pp]u22A5perpendicular

[is][is]u222Bintegral +

[integral][integral]u222Bintegral ***

[sum][sum]u2211summation ***

[product][product]u220Fproduct ***

[coproduct][coproduct]u2210coproduct ***

[gr][gr]u2207gradient +

[sr][sr]u221Aradical sign, square root +

[rn][rn]u203Eoverline +

[radicalex][radicalex]-radical extension

[sqrt][sqrt]u221Aradical sign, square root ***

[sqrtex][sqrtex]-radical extension ***



[lc][lc]u2308left ceiling +

[rc][rc]u2309right ceiling +

[lf][lf]u230Aleft floor +

[rf][rf]u230Bright floor +



[if][if]u221Einfinity +

[Ah][Ah]u2135aleph symbol

[Fn][Fn]u0192lowercase f with hook, function

[Im][Im]u2111blackletter I, imaginary part

[Re][Re]u211Cblackletter R, real part

[wp][wp]u2118Weierstrass p

[pd][pd]u2202partial differential

[-h][-h]u210Fh bar

[hbar][hbar]u210Fh bar
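
The Unicode column in these tables uses a compact notation: the letter u followed by hexadecimal digits, with further components joined by an underscore for composite glyphs such as u003D_0338. As a rough illustration only, the following Python sketch decodes such entries into strings. The function name decode_unicode_column is hypothetical and is not part of groff, and the sketch assumes entries of exactly the shape shown in the tables; the "-" placeholder used for glyphs without a code point is rejected.

```python
import unicodedata


def decode_unicode_column(entry: str) -> str:
    """Decode a table entry such as 'u2264' or 'u003D_0338'.

    Hypothetical helper for illustration; it only understands the
    'uXXXX' and 'uXXXX_YYYY' shapes that appear in the glyph tables.
    """
    if not entry.startswith("u"):
        raise ValueError(f"not a uXXXX entry: {entry!r}")
    # Drop the leading 'u'; remaining components are hexadecimal code points.
    components = entry[1:].split("_")
    return "".join(chr(int(component, 16)) for component in components)


if __name__ == "__main__":
    # Glyph names and Unicode entries copied from the table above.
    samples = [("<=", "u2264"), ("!=", "u003D_0338"), ("nm", "u2208_0338")]
    for name, entry in samples:
        text = decode_unicode_column(entry)
        code_points = " ".join(f"U+{ord(ch):04X}" for ch in text)
        # NFC composes a base plus combining overlay where Unicode
        # defines a precomposed character.
        composed = unicodedata.normalize("NFC", text)
        print(f"{name}: {entry} -> {code_points} -> {composed}")
```

Under NFC normalization, the pair U+003D U+0338 composes to U+2260 (not equal to), which is one way to see why the table spells "not equals" as a base glyph plus the combining long solidus overlay.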

 

Greek glyphs

These glyphs are intended for technical use, not for typesetting Greek language text; normally, the uppercase letters have upright shape, and the lowercase ones are slanted.
OutputInputUnicodeNotes

[*A][*A]u0391uppercase alpha +

[*B][*B]u0392uppercase beta +

[*G][*G]u0393uppercase gamma +

[*D][*D]u0394uppercase delta +

[*E][*E]u0395uppercase epsilon +

[*Z][*Z]u0396uppercase zeta +

[*Y][*Y]u0397uppercase eta +

[*H][*H]u0398uppercase theta +

[*I][*I]u0399uppercase iota +

[*K][*K]u039Auppercase kappa +

[*L][*L]u039Buppercase lambda +

[*M][*M]u039Cuppercase mu +

[*N][*N]u039Duppercase nu +

[*C][*C]u039Euppercase xi +

[*O][*O]u039Fuppercase omicron +

[*P][*P]u03A0uppercase pi +

[*R][*R]u03A1uppercase rho +

[*S][*S]u03A3uppercase sigma +

[*T][*T]u03A4uppercase tau +

[*U][*U]u03A5uppercase upsilon +

[*F][*F]u03A6uppercase phi +

[*X][*X]u03A7uppercase chi +

[*Q][*Q]u03A8uppercase psi +

[*W][*W]u03A9uppercase omega +



[*a][*a]u03B1lowercase alpha +

[*b][*b]u03B2lowercase beta +

[*g][*g]u03B3lowercase gamma +

[*d][*d]u03B4lowercase delta +

[*e][*e]u03B5lowercase epsilon +

[*z][*z]u03B6lowercase zeta +

[*y][*y]u03B7lowercase eta +

[*h][*h]u03B8lowercase theta +

[*i][*i]u03B9lowercase iota +

[*k][*k]u03BAlowercase kappa +

[*l][*l]u03BBlowercase lambda +

[*m][*m]u03BClowercase mu +

[*n][*n]u03BDlowercase nu +

[*c][*c]u03BElowercase xi +

[*o][*o]u03BFlowercase omicron +

[*p][*p]u03C0lowercase pi +

[*r][*r]u03C1lowercase rho +

[*s][*s]u03C3lowercase sigma +

[*t][*t]u03C4lowercase tau +

[*u][*u]u03C5lowercase upsilon +

[*f][*f]u03D5lowercase phi +

[*x][*x]u03C7lowercase chi +

[*q][*q]u03C8lowercase psi +

[*w][*w]u03C9lowercase omega +



[+e][+e]u03F5variant epsilon (lunate)

[+h][+h]u03D1variant theta (cursive form)

[+p][+p]u03D6variant pi (similar to omega)

[+f][+f]u03C6variant phi (curly shape)

[ts][ts]u03C2terminal lowercase sigma +
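
The Greek glyph names above follow a regular scheme: an asterisk plus a Latin letter, with the Latin letters taken in Greek alphabetical order rather than Latin order. The Python sketch below rebuilds the name-to-character mapping from that observation. It is only an illustration under stated assumptions: the function greek_glyph_table and the constant LATIN_KEYS are hypothetical names, and the two irregular entries (the mapping of *f to U+03D5 and the separate name ts for terminal sigma) are copied from the table rather than derived.

```python
# Latin key letters in the order the table lists the Greek alphabet
# (alpha, beta, gamma, ... omega).
LATIN_KEYS = "abgdezyhiklmncoprstufxqw"


def greek_glyph_table() -> dict[str, str]:
    """Map glyph names such as '*a' and '*W' to their characters."""
    table = {}
    code = 0x03B1  # lowercase alpha
    for key in LATIN_KEYS:
        if code == 0x03C2:
            # U+03C2 is terminal sigma, which has its own name 'ts'.
            code += 1
        table["*" + key] = chr(code)
        # Uppercase Greek letters sit 0x20 below the lowercase ones.
        table["*" + key.upper()] = chr(code - 0x20)
        code += 1
    # Irregular entries, taken from the table above.
    table["*f"] = "\u03D5"  # the table maps *f to U+03D5; U+03C6 is +f
    table["ts"] = "\u03C2"
    return table


if __name__ == "__main__":
    for name, char in sorted(greek_glyph_table().items()):
        print(f"{name}\tU+{ord(char):04X}\t{char}")
```

The variant forms (+e, +h, +p, +f) are left out because their code points do not follow the same arithmetic pattern.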


 

Playing card symbols

OutputInputUnicodeNotes

[CL][CL]u2663solid club suit

[SP][SP]u2660solid spade suit

[HE][HE]u2665solid heart suit

[DI][DI]u2666solid diamond suit

 

History

A consideration of the typefaces originally available to AT&T nroff and troff illuminates many conventions that one might regard as idiosyncratic fifty years afterward. (See section "History" of roff(7) for more context.) The face used by the Teletype Model 37 terminals of the Murray Hill Unix Room was based on ASCII, but assigned multiple meanings to several code points, as suggested by that standard. Decimal 34 (") served as a dieresis accent and neutral double quotation mark; decimal 39 (') as an acute accent, apostrophe, and closing (right) single quotation mark; decimal 45 (-) as a hyphen and a minus sign; decimal 94 (^) as a circumflex accent and caret; decimal 96 (`) as a grave accent and opening (left) single quotation mark; and decimal 126 (~) as a tilde accent and (with a half-line motion) swung dash. The Model 37 bore an optional extended character set offering upright Greek letters and several mathematical symbols; these were documented as early as the kbd(VII) man page of the (First Edition) Unix Programmer's Manual.
At the time Graphic Systems delivered the C/A/T phototypesetter to AT&T, the ASCII character set was not considered a standard basis for a glyph repertoire by traditional typographers. In the stock Times roman, italic, and bold styles available, several ASCII characters were not present at all, nor was most of the Teletype's extended character set. AT&T commissioned a "special" font to ensure no loss of repertoire.
A representation of the coverage of the C/A/T's text fonts follows. The glyph resembling an underscore is a baseline rule, and that resembling a vertical line is a box rule. In italics, the box rule was not slanted. We also observe that the hyphen and minus sign were already "de-unified" by the fonts provided; a decision whither to map an input "-" therefore had to be taken.
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
0 1 2 3 4 5 6 7 8 9 [fi] [fl] [Fi] [Fl]
! $ % & ( ) [oq] [cq] * + - . , / : ; = ? [ ] [br]
[bu] [sq] [em] [hy] [ru] [14] [12] [34] [de] [dg] [fm] [ct] [rg] [co]
The special font supplied the missing ASCII and Teletype extended glyphs, among several others. The plus, minus, and equals signs appeared in the special font despite availability in text fonts "to insulate the appearance of equations from the choice of standard [read: text] fonts", a priority since troff was turned to the task of mathematical typesetting as soon as it was developed. We note that AT&T took the opportunity to de-unify the apostrophe/right single quotation mark from the acute accent (a choice ISO later duplicated in its 8859 series of standards). A slash intended to be mirror-symmetric with the backslash was also included, as was the Bell System logo; we do not attempt to depict the latter.
[*a] [*b] [*g] [*d] [*e] [*z] [*y] [*h] [*i] [*k] [*l] [*m] [*n] [*c] [*o] [*p] [*r] [*s] [ts] [*t] [*u] [*f] [*x] [*q] [*w]
[*G] [*D] [*H] [*L] [*C] [*P] [*S] [*U] [*F] [*Q] [*W]
[dq] [aa] [rs] [ha] [ul] [ga] [ti] [sl] < > { } # @ [pl] [mi] [eq] [**]
[>=] [<=] [==] [~=] [ap] [!=] [ua] [da] [<-] [->] [mu] [di] [+-] [if] [pd] [gr] [no] [is] [pt] [sr] [radicalex] [cu] [ca] [sb] [sp] [ib] [ip] [es] [mo]
[sc] [dd] [lh] [rh] [or] [ci] [lt] [lb] [rt] [rb] [lk] [rk] [bv] [lf] [rf] [lc] [rc]
One ASCII character as rendered by the Model 37 was apparently abandoned. That device printed decimal 124 (|) as a broken vertical line, like Unicode U+00A6 (\[bb]). No equivalent was available on the C/A/T; the box rule \[br], brace vertical extension \[bv], and "or" operator \[or] were used as contextually appropriate. Devices supported by AT&T device-independent troff exhibited some differences in glyph detail. For example, on the Autologic APS-5 phototypesetter, the square \(sq became filled in the Times bold face.

Files

The files below are loaded automatically by the default troffrc.
/usr/share/groff/1.23.0/tmac/composite.tmac
assigns alternate mappings for identifiers after the first in a composite special character escape sequence. See subsection "Accents" above.
/usr/share/groff/1.23.0/tmac/fallbacks.tmac
defines fallback mappings for Unicode code points such as the increment sign (U+2206) and upper- and lowercase Roman numerals.
 

Authors

This document was written by James Clark, with additions by Werner Lemberg and Bernd Warken, revised to use tbl(1) by Eric S. Raymond, and largely rewritten by G. Branden Robinson.

See also

Groff: The GNU Implementation of troff, by Trent A. Fisher and Werner Lemberg, is the primary groff manual. Section "Using Symbols" may be of particular note. You can browse it interactively with "info '(groff) Using Symbols'".
"An extension to the troff character set for Europe", E.G. Keizer, K.J. Simonsen, J. Akkerhuis; EUUG Newsletter, Volume 9, No. 2, Summer 1989
The Unicode Standard
"7-bit Character Sets" by Tuomas Salste documents the inherent ambiguity and configurable code points of the ASCII encoding standard.
"Nroff/Troff User's Manual" by Joseph F. Ossanna, 1976, AT&T Bell Laboratories Computing Science Technical Report No. 54, features two tables that throw light on the glyph repertoire available to "typesetter roff" when it was first written. Be careful of re-typeset versions of this document that can be found on the Internet. Some do not accurately represent the original document: several glyphs are obviously missing. More subtly, lowercase Greek letters are rendered upright, not slanted as they appeared in the C/A/T's special font and as expected by troff users.
groff_rfc1345(7) describes an alternative set of special character glyph names, which extends and in some cases overrides the definitions listed above.


 

Index

Name
Description
Fundamental character set
Eight-bit encodings and Latin-1 supplement
Special character escape forms
Glyph tables
Basic Latin
Supplementary Latin letters
Ligatures and digraphs
Accents
Accented characters
Quotation marks
Punctuation
Brackets
Arrows
Rules and lines
Text markers
Legal symbols
Currency symbols
Units
Logical symbols
Mathematical symbols
Greek glyphs
Playing card symbols
History
Files
Authors
See also




