%refer
Section: User Commands (1)
Updated: 2 July 2023
Index
Return to Main Contents
Name
refer - process bibliographic references for
groff
.nr d-fallback 1
.nr d-fallback 1
Synopsis
[
-bCenPRS]
[
-a~n]
[
-B
field.macro]
[
-c~fields]
[
-f~n]
[
-i~fields]
[
-k~field]
[
-l~rang-expression]
[
-p~databas-file]
[
-s~fields]
[
-t~n]
[
file~...]
--help
-v
--version
Description
The GNU implementation of
refer is part of the
document formatting system.
refer
is a
preprocessor that prepares bibilographic citations by looking up
keywords specified in a
input document,
obviating the need to type such annotations,
and permitting the citation style in formatted output to be altered
independently and systematically.
It copies the contents of each
file
to the standard output stream,
except that it interprets lines between
.[
and
.]
as citations to be translated into
groff
input,
and lines between
.R1
and
.R2
as instructions regarding how citations are to be processed.
Normally,
refer
is not executed directly by the user,
but invoked by specifying the
-R
option to
If no
file
operands are given on the command line,
or if
file
is
[lq]
-[rq],
the standard input stream is read.
Each citation specifies a reference.
The citation can specify a reference that is contained in a
bibliographic database by giving a set of keywords that only that
reference contains.
Alternatively it can specify a reference by supplying a database record
in the citation.
A combination of these alternatives is also possible.
For each citation,
refer
can produce a mark in the text.
This mark consists of some label which can be separated from the text
and from other labels in various ways.
For each reference it also outputs
language commands that can be used by a macro package to produce a
formatted reference for each citation.
The output of
refer
must therefore be processed using a suitable macro package,
such as
me,
mm,
mom,
or
ms.
The commands to format a citation's reference can be output immediately
after the citation,
or the references may be accumulated,
and the commands output at some later point.
If the references are accumulated,
then multiple citations of the same reference will produce a single
formatted reference.
The interpretation of lines between
.R1
and
.R2
as prepreocessor commands is a feature of GNU
refer.
Documents making use of this feature can still be processed by AT&T
refer just by adding the lines
-
.de R1
.ig R2
..
to the beginning of the document.
This will cause
to ignore everything between
.R1
and
.R2.
The effect of some commands can also be achieved by options.
These options are supported mainly for compatibility with AT&T
refer.
It is usually more convenient to use commands.
refer
generates
.lf
requests so that file names and line numbers in messages produced by
commands that read
refer
output will be correct;
it also interprets lines beginning with
.lf
so that file names and line numbers in the messages and
.lf
lines that it produces will be accurate even if the input has been
preprocessed by a command such as
Bibliographic databases
The bibliographic database is a text file consisting of records
separated by one or more blank lines.
Within each record fields start with a
%
at the beginning of a line.
Each field has a one character name that immediately follows the
%.
It is best to use only upper and lower case letters for the names
of fields.
The name of the field should be followed by exactly one space,
and then by the contents of the field.
Empty fields are ignored.
The conventional meaning of each field is as follows:
- %A
-
The name of an author.
If the name contains a suffix such as [lq]Jr.[rq],
it should be separated from the last name by a comma.
There can be multiple occurrences of the
%A
field.
The order is significant.
It is a good idea always to supply an
%A
field or a
%Q
field.
- %B
-
For an article that is part of a book,
the title of the book.
- %C
-
The place (city) of publication.
- %D
-
The date of publication.
The year should be specified in full.
If the month is specified,
the name rather than the number of the month should be used,
but only the first three letters are required.
It is a good idea always to supply a
%D
field;
if the date is unknown,
a value such as
in press
or
unknown
can be used.
- %E
-
For an article that is part of a book,
the name of an editor of the book.
Where the work has editors and no authors,
the names of the editors should be given as
%A
fields and
[lq],~(ed.)[rq]
or
[lq],~(eds.)[rq]
should be appended to the last author.
- %G
-
U.S. government ordering number.
- %I
-
The publisher (issuer).
- %J
-
For an article in a journal,
the name of the journal.
- %K
-
Keywords to be used for searching.
- %L
-
Label.
- %N
-
Journal issue number.
- %O
-
Other information.
This is usually printed at the end of the reference.
- %P
-
Page number.
A range of pages can be specified as
m-n.
- %Q
-
The name of the author,
if the author is not a person.
This will only be used if there are no
%A
fields.
There can only be one
%Q
field.
- %R
-
Technical report number.
- %S
-
Series name.
- %T
-
Title.
For an article in a book or journal,
this should be the title of the article.
- %V
-
Volume number of the journal or book.
- %X
-
Annotation.
For all fields except
%A
and
%E,
if there is more than one occurrence of a particular field in a record,
only the last such field will be used.
If accent strings are used,
they should follow the character to be accented.
This means that an
ms
document must call the
.AM
macro when it initializes.
Accent strings should not be quoted:
use one
rather than two.
Accent strings are an obsolescent feature of the
me
and
ms
macro packages;
modern documents should use
groff
special character escape sequences instead;
see
Citations
Citations have a characteristic format.
-
.[openin-text
flags keywords
fields
.]closin-text
The
openin-text,
closin-text,
and
flags
components are optional.
Only one of the
keywords
and
fields
components need be specified.
The
keywords
component says to search the bibliographic databases for a reference
that contains all the words in
keywords.
It is an error if more than one reference is found.
The
fields
components specifies additional fields to replace or supplement those
specified in the reference.
When references are being accumulated and the
keywords
component is no-empty,
then additional fields should be specified only on the first occasion
that a particular reference is cited,
and will apply to all citations of that reference.
The
openin-text
and
closin-text
components specify strings to be used to bracket the label instead of
those in the
bracket-label
command.
If either of these components is no-empty,
the strings specified in the
bracket-label
command will not be used;
this behavior can be altered using the
[
and
]
flags.
Leading and trailing spaces are significant for these components.
The
flags
component is a list of no-alphanumeric characters each of which
modifies the treatment of this particular citation.
AT&T
refer will treat these flags as part of the keywords and so will ignore them
since they are no-alphanumeric.
The following flags are currently recognized.
- #
-
Use the label specified by the
short-label
command,
instead of that specified by the
label
command.
If no short label has been specified,
the normal label will be used.
Typically the short label is used with autho-date labels and consists
of only the date and possibly a disambiguating letter;
the
[lq]#[rq]
is supposed to be suggestive of a numeric type of label.
- [
-
Precede
openin-text
with the first string specified in the
bracket-label
command.
- ]
-
Follow
closin-text
with the second string specified in the
bracket-label
command.
An advantage of using the
[
and
]
flags rather than including the brackets in
openin-text
and
closin-text
is that
you can change the style of bracket used in the document just by
changing the
bracket-label
command.
Another is that sorting and merging of citations will not necessarily be
inhibited if the flags are used.
If a label is to be inserted into the text,
it will be attached to the line preceding the
.[
line.
If there is no such line,
then an extra line will be inserted before the
.[
line and a warning will be given.
There is no special notation for making a citation to multiple
references.
Just use a sequence of citations,
one for each reference.
Don't put anything between the citations.
The labels for all the citations will be attached to the line preceding
the first citation.
The labels may also be sorted or merged.
See the description of the
<>
label expression,
and of the
sort-adjacent-labels
and
abbreviate-label-ranges
commands.
A label will not be merged if its citation has a no-empty
openin-text
or
closin-text.
However,
the labels for a citation using the
]
flag and without any
closin-text
immediately followed by a citation using the
[
flag and without any
openin-text
may be sorted and merged
even though the first citation's
openin-text
or the second citation's
closin-text
is no-empty.
(If you wish to prevent this,
use the dummy character escape sequence
[rs]&
as the first citation's
closin-text.)
Commands
Commands are contained between lines starting with
.R1
and
.R2.
Recognition of these lines can be prevented by the
-R
option.
When a
.R1
line is recognized any accumulated references are flushed out.
Neither
.R1
nor
.R2
lines,
nor anything between them,
is output.
Commands are separated by newlines or semicolons.
A number sign
(
#)
introduces a comment that extends to the end of the line,
but does not conceal the newline.
Each command is broken up into words.
Words are separated by spaces or tabs.
A word that begins with a (neutral) double quote
(
[dq])
extends to the next double quote that is not followed by another double
quote.
If there is no such double quote,
the word extends to the end of the line.
Pairs of double quotes in a word beginning with a double quote collapse
to one double quote.
Neither a number sign nor a semicolon is recognized inside double
quotes.
A line can be continued by ending it with a backslash
[lq]
[rs][rq];
this works everywhere except after a number sign.
name
that is marked with *Each command has an associated negative command
no-name
that undoes the effect of
name.
For example,
the
no-sort
command specifies that references should not be sorted.
The negative commands take no arguments.
In the following description each argument must be a single word;
field
is used for a single upper or lower case letter naming a field;
fields
is used for a sequence of such letters;
m
and
n
are used for a no-negative numbers;
string
is used for an arbitrary string;
file
is used for the name of a file.
- abbreviate*Each command~fields~string1~string2~string3~string4
-
Abbreviate the first names of
fields.
An initial letter will be separated from another initial letter by
string1,
from the last name by
string2,
and from anything else
(such as [lq]von[rq] or [lq]de[rq])
by
string3.
These default to a period followed by a space.
In a hyphenated first name,
the initial of the first part of the name will be separated from the
hyphen by
string4;
this defaults to a period.
No attempt is made to handle any ambiguities that might
result from abbreviation.
Names are abbreviated before sorting and before label construction.
- abbreviate-label-ranges*Each command~string
-
Three or more adjacent labels that refer to consecutive references
will be abbreviated to a label consisting of the first label,
followed by
string,
followed by the last label.
This is mainly useful with numeric labels.
If
string
is omitted,
it defaults to
[lq]-[rq].
- accumulate*Each command
-
Accumulate references instead of writing out each reference
as it is encountered.
Accumulated references will be written out whenever a reference
of the form
-
-
.[
$LIST$
.]
is encountered,
after all input files have been processed,
and whenever a
.R1
line is recognized.
- annotate*Each command~field string
-
field
is an annotation;
print it at the end of the reference as a paragraph preceded by the line
-
-
.string
If
string
is omitted,
it will default to
AP;
if
field
is also omitted it will default to
X.
Only one field can be an annotation.
- articles~string~
-
...
Each
string
is a definite or indefinite article,
and should be ignored at the beginning of
T
fields when sorting.
Initially,
[lq]a[rq],
[lq]an[rq],
and
[lq]the[rq] are recognized as articles.
- bibliography~file~
-
...
Write out all the references contained in each bibliographic database
file.
This command should come last in an
.R1/.R2
block.
- bracket-label~string1 string2 string3
-
In the text,
bracket each label with
string1
and
string2.
An occurrence of
string2
immediately followed by
string1
will be turned into
string3.
The default behavior is as follows.
-
" RS twice to get inboard of the tagged paragraph indentation.
-
bracket-label *([. *(.] [dq], [dq]
- capitalize~fields
-
Convert
fields
to caps and small caps.
- compatible*Each command
-
Recognize
.R1
and
.R2
even when followed by a character other than space or newline.
- database~file~
-
...
Search each bibliographic database
file.
For each
file,
if an index
file.i
created by
exists,
then it will be searched instead;
each index can cover multiple databases.
- date-as-label*Each command~string
-
string
is a label expression that specifies a string with which to replace the
D
field after constructing the label.
See subsection [lq]Label expressions[rq] below for a description of
label expressions.
This command is useful if you do not want explicit labels in the
reference list,
but instead want to handle any necessary disambiguation by qualifying
the date in some way.
The label used in the text would typically be some combination of the
author and date.
In most cases you should also use the
no-label-in-reference
command.
For example,
-
" RS twice to get inboard of the tagged paragraph indentation.
-
date-as-label D.+yD.y%a*D.-y
would attach a disambiguating letter to the year part of the
D
field in the reference.
- default-database*Each command
-
The default database should be searched.
This is the default behavior,
so the negative version of this command is more useful.
refer
determines whether the default database should be searched
on the first occasion that it needs to do a search.
Thus a
no-default-database
command must be given before then,
in order to be effective.
- discard*Each command~fields
-
When the reference is read,
fields
should be discarded;
no string definitions for
fields
will be output.
Initially,
fields
are
XYZ.
- et-al*Each command~string m n
-
Control use of
et al.
in the evaluation of
@
expressions in label expressions.
If the number of authors needed to make the author sequence unambiguous
is
u
and the total number of authors is
t
then the last
t-u
authors will be replaced by
string
provided that
t-u
is not less than
m
and
t
is not less than
n.
The default behavior is as follows.
-
" RS twice to get inboard of the tagged paragraph indentation.
-
et-al [dq] et al[dq] 2 3
Note the absence of a dot from the end of the abbreviation,
which is arguably not correct.
(Et al[.]
is short for
et alli,
as
etc.
is short for
et cetera.)
- include~file
-
Include
file
and interpret the contents as commands.
- join-authors~string1 string2 string3
-
Join multiple authors together with
strings.
When there are exactly two authors,
they will be joined with
string1.
When there are more than two authors,
all but the last two will be joined with
string2,
and the last two authors will be joined with
string3.
If
string3
is omitted,
it will default to
string1;
if
string2
is also omitted it will also default to
string1.
For example,
-
-
join-authors [dq] and [dq] [dq], [dq] [dq], and [dq]
will restore the default method for joining authors.
- label-in-reference*Each command
-
When outputting the reference,
define the string
[F
to be the reference's label.
This is the default behavior,
so the negative version of this command is more useful.
- label-in-text*Each command
-
For each reference output a label in the text.
The label will be separated from the surrounding text as described in
the
bracket-label
command.
This is the default behavior,
so the negative version of this command is more useful.
- label~string
-
string
is a label expression describing how to label each reference.
- separate-label-second-parts~string
-
When merging tw-part labels,
separate the second part of the second label from the first label with
string.
See the description of the
<>
label expression.
- move-punctuation*Each command
-
In the text,
move any punctuation at the end of line past the label.
It is usually a good idea to give this command unless you are using
superscripted numbers as labels.
- reverse*Each command~string
-
Reverse the fields whose names
are in
string.
Each field name can be followed by a number which says how many such
fields should be reversed.
If no number is given for a field,
all such fields will be reversed.
- search-ignore*Each command~fields
-
While searching for keys in databases for which no index exists,
ignore the contents of
fields.
Initially,
fields
XYZ
are ignored.
- search-truncate*Each command~n
-
Only require the first
n
characters of keys to be given.
In effect when searching for a given key words in the database are
truncated to the maximum of
n
and the length of the key.
Initially,
n
is~6.
- short-label*Each command~string
-
string
is a label expression that specifies an alternative
(usually shorter)
style of label.
This is used when the
#
flag is given in the citation.
When using autho-date style labels,
the identity of the author or authors is sometimes clear from the
context,
and so it may be desirable to omit the author or authors from the label.
The
short-label
command will typically be used to specify a label containing just
a date and possibly a disambiguating letter.
- sort*Each command~string
-
Sort references according to
string.
References will automatically be accumulated.
string
should be a list of field names,
each followed by a number,
indicating how many fields with the name should be used for sorting.
[lq]+[rq]
can be used to indicate that all the fields with the name should be
used.
Also
.
can be used to indicate the references should be sorted using the
(tentative) label.
(Subsection [lq]Label expressions[rq] below describes the concept of a
tentative label.)
- sort-adjacent-labels*Each command
-
Sort labels that are adjacent in the text according to their position
in the reference list.
This command should usually be given if the
abbreviate-label-ranges
command has been given,
or if the label expression contains a
<>
expression.
This will have no effect unless references are being accumulated.
Label expressions
Label expressions can be evaluated both normally and tentatively.
The result of normal evaluation is used for output.
The result of tentative evaluation,
called the
tentative label,
is used to gather the information that normal evaluation needs to
disambiguate the label.
Label expressions specified by the
date-as-label
and
short-label
commands are not evaluated tentatively.
Normal and tentative evaluation are the same for all types of expression
other than
@,
*,
and
%
expressions.
The description below applies to normal evaluation,
except where otherwise specified.
- field
-
field~n
The
n-th
part of
field.
If
n
is omitted,
it defaults to~1.
- [aq]string[aq]
-
The characters in
string
literally.
- @
-
All the authors joined as specified by the
join-authors
command.
The whole of each author's name will be used.
However,
if the references are sorted by author
(that is,
the sort specification starts with
[lq]A+[rq]),
then authors' last names will be used instead,
provided that this does not introduce ambiguity,
and also an initial subsequence of the authors may be used instead of
all the authors,
again provided that this does not introduce ambiguity.
The use of only the last name for the
i-th
author of some reference
is considered to be ambiguous if
there is some other reference,
such that the first
i-1
authors of the references are the same,
the
i-th
authors are not the same,
but the
i-th
authors last names are the same.
A proper initial subsequence of the sequence of authors for some
reference is considered to be ambiguous if there is a reference with
some other sequence of authors which also has that subsequence as a
proper initial subsequence.
When an initial subsequence of authors is used,
the remaining authors are replaced by the string specified by the
et-al
command;
this command may also specify additional requirements that must be
met before an initial subsequence can be used.
@
tentatively evaluates to a canonical representation of the authors,
such that authors that compare equally for sorting purpose will have
the same representation.
- %n
-
%a
%A
%i
%I
The serial number of the reference formatted according to the
character following the
%.
The serial number of a reference is~1 plus the number of earlier
references with same tentative label as this reference.
These expressions tentatively evaluate to an empty string.
- expr*
-
If there is another reference with the same tentative label as this
reference,
then
expr,
otherwise an empty string.
It tentatively evaluates to an empty string.
- expr+n
-
expr-n
The first
(+)
or last
(-)
n
upper or lower case letters or digits of
expr.
roff
special characters
(such as
([aq]a)
count as a single letter.
Accent strings are retained but do not count towards the total.
- expr.l
-
expr
converted to lowercase.
- expr.u
-
expr
converted to uppercase.
- expr.c
-
expr
converted to caps and small caps.
- expr.r
-
expr
reversed so that the last name is first.
- expr.a
-
expr
with first names abbreviated.
Fields specified in the
abbreviate
command are abbreviated before any labels are evaluated.
Thus
.a
is useful only when you want a field to be abbreviated in a label
but not in a reference.
- expr.y
-
The year part of
expr.
- expr.+y
-
The part of
expr
before the year,
or the whole of
expr
if it does not contain a year.
- expr.-y
-
The part of
expr
after the year,
or an empty string if
expr
does not contain a year.
- expr.n
-
The last name part of
expr.
- expr1[ti]expr2
-
expr1
except that if the last character of
expr1
is
-
then it will be replaced by
expr2.
- expr1 expr2
-
The concatenation of
expr1
and
expr2.
- expr1|expr2
-
If
expr1
is no-empty then
expr1
otherwise
expr2.
- expr1&expr2
-
If
expr1
is no-empty
then
expr2
otherwise an empty string.
- expr1?expr2:expr3
-
If
expr1
is no-empty
then
expr2
otherwise
expr3.
- <expr>
-
The label is in two parts,
which are separated by
expr.
Two adjacent tw-part labels which have the same first part will be
merged by appending the second part of the second label onto the first
label separated by the string specified in the
separate-label-second-parts
command
(initially,
a comma followed by a space);
the resulting label will also be a tw-part label with the same first
part as before merging,
and so additional labels can be merged into it.
It is permissible for the first part to be empty;
this may be desirable for expressions used in the
short-label
command.
- (expr)
-
The same as
expr.
Used for grouping.
The above expressions are listed in order of precedence
(highest first);
&
and
|
have the same precedence.
Macro interface
Each reference starts with a call to the macro
]-.
The string
[F
will be defined to be the label for this reference,
unless the
no-label-in-reference
command has been given.
There then follows a series of string definitions,
one for each field:
string
[X
corresponds to field
X.
The register
[P
is set to~1 if the
P
field contains a range of pages.
The
[T,
[A
and
[O
registers are set to~1 according as the
T,
A
and
O
fields end with any of
.?!
(an en-o-sentence character).
The
[E
register will be set to~1 if the
[E
string contains more than one name.
The reference is followed by a call to the
][
macro.
The first argument to this macro gives a number representing
the type of the reference.
If a reference contains a
J
field,
it will be classified as type~1,
otherwise if it contains a
B
field,
it will be type~3,
otherwise if it contains a
G
or
R
field it will be type~4,
otherwise if it contains an
I
field it will be type~2,
otherwise it will be type~0.
The second argument is a symbolic name for the type:
other,
journal-article,
book,
article-in-book,
or
tech-report.
Groups of references that have been accumulated or are produced by the
bibliography
command are preceded by a call to the
]<
macro and followed by a call to the
]>
macro.
Options
--help
displays a usage message,
while
-v
and
--version
show version information;
all exit afterward.
- -R
-
Don't recognize lines beginning with
.R1/.R2.
Other options are equivalent to
refer
commands.
- -a~n
-
reverse
An
- -b
-
no-label-in-text; no-label-in-reference
- -B
-
See below.
- -c~fields
-
capitalize
fields
- -C
-
compatible
- -e
-
accumulate
- -f~n
-
label
%n
- -i~fields
-
search-ignore
fields
- -k
-
label
L[ti]%a
- -k~field
-
label
field[ti]%a
- -l
-
label
A.nD.y%a
- -l~m
-
label
A.n+mD.y%a
- -l~,n
-
label
A.nD.y-n%a
- -l~m,n
-
label
A.n+mD.y-n%a
- -n
-
no-default-database
- -p~d-file
-
database
d-file
- -P
-
move-punctuation
- -s~spec
-
sort
spec
- -S
-
label [dq](A.n|Q) [aq], [aq] (D.y|D)[dq]; bracke-label [dq]~([dq]~)~[dq];~[dq]
- -t~n
-
search-truncate
n
The
B
option has command equivalents with the addition that the file names
specified on the command line are processed as if they were arguments to
the
bibliography
command instead of in the normal way.
- -B
-
annotate X AP; no-label-in-reference
- -B~field.macro
-
annotate
field
macro;
no-label-in-reference
Environment
- REFER
-
If set,
overrides the default database.
Files
- /usr/:dict/:papers/:Ind
-
Default database.
- file.i
-
Index files.
- /usr/:share/:groff/:1.23.0/:tmac/:refer:.tmac
-
defines macros and strings facilitating integration with macro packages
that wish to support
refer.
refer
uses temporary files.
See the
man page for details of where such files are created.
Bugs
In label expressions,
<>
expressions are ignored inside
.char
expressions.
Examples
We can illustrate the operation of
refer
with a sample bibliographic database containing one entry and a simple
roff
document to cite that entry.
-
$ cat > my-db-file
%A Daniel P.[rs]& Friedman
%A Matthias Felleisen
%C Cambridge, Massachusetts
%D 1996
%I The MIT Press
%T The Little Schemer, Fourth Edition
$ refer-p my-db-file
Read the book
.[
friedman
.]
on your summer vacation.
<Control+D>
.lf 1 -
Read the book[rs]*([.1[rs]*(.]
.ds [F 1
.]-
.ds [A Daniel P. Friedman and Matthias Felleisen
.ds [C Cambridge, Massachusetts
.ds [D 1996
.ds [I The MIT Press
.ds [T The Little Schemer, Fourth Edition
.nr [T 0
.nr [A 0
.][ 2 book
.lf 5 -
on your summer vacation.
The foregoing shows us that
refer
(a) produces a label [lq]1[rq];
(b) brackets that label with interpolations of the
[lq]
[.[rq]
and
[lq]
.][rq]
strings;
(c) calls a macro
[lq]
]-[rq];
(d) defines strings and registers containing the label and bibliographic
data for the reference;
(e) calls a macro
[lq]
][[rq];
and (f) uses the
lf
request to restore the line numbers of the original input.
As discussed in subsection [lq]Macro interface[rq] above,
it is up to the document or a macro package to employ and format this
information usefully.
Let us see how we might turn
to this task.
-
$ REFER=my-db-file groff -R -ms
.LP
Read the book
.[
friedman
.]
on your summer vacation.
Commentary is available.[rs]*{*[rs]*}
.FS [rs]*{*[rs]*}
Space reserved for penetrating insight.
.FE
ms's
automatic footnote numbering mechanism is not aware of
refer's
label numbering,
so we have manually specified a (superscripted) symbolic footnote for
our no-bibliographic aside.
See also
[lq]Some Applications of Inverted Indexes on the Unix System[rq],
by M. E. Lesk,
1978,
AT&T Bell Laboratories Computing Science Technical Report No. 69.
Index
- Name
-
- Synopsis
-
- Description
-
- Bibliographic databases
-
- Citations
-
- Commands
-
- Label expressions
-
- Macro interface
-
- Options
-
- Environment
-
- Files
-
- Bugs
-
- Examples
-
- See also
-