Line breaking in LaTeX

I guess every LaTeX user has at some point found herself in a situation where LaTeX refuses to split a word at a place the user knows is correct (but LaTeX doesn’t), and this may or may not have caused some trouble in the formatting of the paragraph. It is worth mentioning that LaTeX is very good at hyphenating English words, and will almost never split a word incorrectly. When in doubt, it won’t split it at all.

When such a situation arises, the user has (as usual with LaTeX) more than one way to fix it. We can tell LaTeX explicitly where it can hyphenate the word. Using an example from Leslie Lamport’s 1985 “LaTeX: User’s Guide & Reference Manual”:

LaTeX does not know how to hyphenate gnomonly. We can write the word like this: gno\-mon\-ly, and then it will know that the word can be split at the places where \- appears (the backslash is not a typo, it is required).
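To see the \- markers at work, here is a minimal sketch of a complete document. The 4cm box and the sample sentence are my own choices for illustration, picked only so that a break inside the word becomes likely; they are not from Lamport’s book.

```latex
\documentclass{article}
\begin{document}
% A narrow box forces frequent line breaks, so hyphenation is more likely.
\parbox{4cm}{%
  % \- marks the places where LaTeX is allowed to break this one
  % occurrence of the word. It prints nothing unless a break happens there.
  The shadow crept gno\-mon\-ly across the dial, as it does every day.
}
\end{document}
```

Whether the word actually ends up split depends on where it falls in the line; \- only gives permission, it does not force a break.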

However, this is an ad hoc solution. We might find ourselves writing gnomonly quite often (well, not really that often), and having to write gno\-mon\-ly every time gets old after a while (for me, the second time). To avoid this MS Word-ish solution, we can add the following statement to our preamble: \hyphenation{gno-mon-ly gno-mon}. From that moment on, LaTeX will know how to hyphenate gnomon and gnomonly. However, it will still be unable to hyphenate gnomonic (for that, you would have to add that word to a \hyphenation statement, too).
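A minimal sketch of how this looks in a full preamble follows. The entry for gnomonic and the sample sentence are my additions for illustration. \showhyphens is a standard command that writes the break points LaTeX would use to the log file, which is handy for checking that the exceptions were picked up.

```latex
\documentclass{article}
% Hyphenation exceptions: words are separated by spaces, and a plain
% hyphen (no backslash here) marks each allowed break point.
\hyphenation{gno-mon-ly gno-mon gno-mon-ic}
\begin{document}
% Writes the hyphenation LaTeX would use for these words to the log.
\showhyphens{gnomon gnomonly gnomonic}
The gnomon casts its shadow gnomonly on a gnomonic projection.
\end{document}
```

The exceptions apply to the whole document, so the words in the body need no \- markers at all.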

A related problem can happen when long words or expressions appear at the end of a line. It might be impossible to split the word in such a way that the line containing it is not longer than all the other lines in the same paragraph. In such a situation, LaTeX issues an overfull \hbox warning. To understand the problem, let’s see how LaTeX manages line breaks.

The user (through the page settings and style, and all the stuff in the preamble etc.) tells LaTeX six things about lines in paragraphs: desired line width (page width less margins, if it’s a one-column text), desired inter-word space, and minimum and maximum acceptable values for both. When LaTeX writes a text, it does the following:

  1. Choose an integer number of words that, separated by exactly the desired inter-word space, make a line of the desired width.
  2. If that is not possible, try the least obtrusive fix that gives the best result, from the following:
    • Increase/decrease the inter-word space (within the acceptable limits) until an integer number of words makes a line of the desired width.
    • Split the last word in the line, so that a non-integer number of words makes a line of the desired width.
  3. If none of the above yields a perfect line width, test whether the result is within the acceptable limits.
  4. If the width of the line is not acceptable, print it anyway and issue a warning (overfull \hbox if it was too wide, underfull \hbox if it was too thin, whichever is less incorrect).
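To watch the last step happen, here is a small sketch that should provoke an overfull \hbox on purpose. The 2cm width and the \mbox trick are my own choices for the demonstration: \mbox keeps its content in one piece, so LaTeX cannot hyphenate the word inside it.

```latex
\documentclass{article}
% Mark every overfull line with a black bar in the margin, so the
% problem is visible in the output and not only in the log.
\overfullrule=5pt
\begin{document}
% A 2cm column is narrower than the unbreakable word below, so no
% amount of stretching or shrinking the spaces can make the line fit.
\parbox{2cm}{%
  This is \mbox{incommensurably} hard to fit.
}
\end{document}
```

The log should contain an overfull \hbox warning saying by how many points the line sticks out, and the black bar shows where.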

Ill-sized lines are usually very ugly to the eye, even for small deviations, so these errors are worth fixing. It is important to understand that LaTeX’s standards on what inter-word space range (and what line width range) is acceptable are quite strict, and it prefers to stick to them and produce a line that is too wide, giving a warning in the output. Usually this is sensible, but often we would rather override its standards and make the freaking line fit in the fixed-width paragraph.

To do so, we can enclose the paragraph between \begin{sloppypar} and \end{sloppypar}. For example:

\begin{sloppypar}
This text that I am writing is in fact astonishingly and utterly incommensurably acojonantemente chungo to fit correctly in a line.
\end{sloppypar}

The sloppypar environment gives the text within it a much wider acceptable inter-word space range. This means more flexibility at step 2 above, so that we’ll hardly ever fall through to step 3.
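As far as I know, in the standard LaTeX kernel sloppypar is just a paragraph wrapped around \sloppy, which sets \tolerance to 9999 and \emergencystretch to 3em. If that is looser than you want, a gentler variant can be sketched by setting the same two parameters by hand inside a group. The values 2000 and 1.5em below are purely my assumption, something to tune by eye, not a recommendation from any manual.

```latex
\documentclass{article}
\begin{document}
{% The braces keep the relaxed settings local to this one paragraph.
  \tolerance=2000          % accept looser lines than the default allows
  \emergencystretch=1.5em  % extra stretch used only as a last resort
  This text that I am writing is in fact astonishingly and utterly
  incommensurably hard to fit correctly in a line.\par
  % The \par must come before the closing brace: the settings in force
  % when the paragraph ends are the ones used to break it into lines.
}
\end{document}
```

The trade-off is the same as with sloppypar, only milder: fewer overfull lines, at the price of some lines with visibly wider spaces.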
