A program of a DP (dynamic programming) for pretty printing
The problem.
Input:
A text document (English), the number of characters in a line, a cost
function for redundant spaces.
Output:
A layout of the given document, right-justified, such that the total cost
is minimum.
Note: The output should be viewed with a monospaced font (i.e., fixed pitch
or non-proportional font).
An input example ----------------------------------------------------------
Metaheuristic algorithms are widely recognized as one of the most
practical approaches for combinatorial optimization problems.
Among representative metaheuristics are genetic algorithm,
simulated annealing, tabu search and so on. In this paper, we
explain essential ideas used in such metaheuristic algorithms
within a generalized framework of local search. We then conduct
numerical experiment of metaheuristic algorithms using rather
simple implementations, to observe general tendencies of their
performance. From these results, we propose a few recommendations
about the use of metaheuristics as simple optimization tools. We
also mention some advanced techniques to enhance the ability of
metaheuristics. Finally, we summarize some theoretical results on
metaheuristic algorithms.
The number of characters in a line: 75
Cost for redundant space: When k redundant spaces are placed after a word,
a cost of k^5 is added (one space is not redundant, so k=0 in such a case).
An output example ---------------------------------------------------------
Metaheuristic algorithms are widely recognized as one of the most practical
approaches for combinatorial optimization problems. Among representative
metaheuristics are genetic algorithm, simulated annealing, tabu search
and so on. In this paper, we explain essential ideas used in such
metaheuristic algorithms within a generalized framework of local search. We
then conduct numerical experiment of metaheuristic algorithms using rather
simple implementations, to observe general tendencies of their performance.
From these results, we propose a few recommendations about the use of
metaheuristics as simple optimization tools. We also mention some advanced
techniques to enhance the ability of metaheuristics. Finally, we summarize
some theoretical results on metaheuristic algorithms.
Total cost: 28
-----------------------------------------------------------------------------
Please see dp_eng.pdf to understand the idea of the DP algorithm.
Usage.
To compile, just type "make", or you can type:
gcc -Wall -O2 -o typeset typeset.c -lm
(necessary files for this: typeset.c, cpu_time.c, makefile).
Input a document from stdin, then the result is output to stdout.
The number of characters can be specified as a parameter.
If the document has two paragraphs or more, please put an empty line
between two paragraphs.
For example, if you have a document whose file name is "sample.txt", you
can get a layout by
cat sample.txt | ./typeset
If you would like to specify some parameters, type for example
cat sample.txt | ./typeset (parameter name) (parameter value) ...
cat sample.txt | ./typeset lmin 65
cat sample.txt | ./typeset lmin 60 lmax 70 outlvl 1
Parameters you can specify:
---------------------------------------------------------------------------
parameter default explanation
name value
--------- ------- -------------------------------------------
lmax 80 max number of characters in a line
(if it's less than lmin -> modified to lmin)
lmin 70 min number of characters in a line
(if it's less than min word length
-> set to min word length)
ind1 0 indent (#spaces) of the 1st paragraph
ind2 0 indent of 2nd paragraph and later
lines 1 number of empty lines between paragraphs
ctype 2 cost type ("exp" is a parameter)
1: (#spaces in a line)^{exp}.
if this cost is specified, the layout is
not right-justified.
2: When k redundant spaces are placed after
a word, a cost of k^{exp} is added (one
space is considered k=0).
exp 5 The exponent of cost function
pslide 1 1: '.' and ',' at the end of a line is
placed one-character outside of
the text (available only if ctype=2).
0: a punctuation is considered as a part of
a word.
outlvl 0 Output level.
0: Text only.
1: Text, adopted text width, optimal cost,
and computation time.
2: In addition, optimal cost for each value
of the number of characters in a line
between lmin and lmax.
pslide = 1 was prepared because when "." or "," is at the end a line, I
felt such a line seems like a dent, and I felt it nicer to have them
one-character to the right. This will depend on the font set, so if you do
not like it, please set pslide = 0.
Usage on emacs:
You can use it from emacs. Specify a region (from the place marked by C-spc
to the current place of the cursor).
Then type "M-|" (or "M-x shell-command-on-region") and "typeset [option(s)]".
If you have comments (e.g., bugs), please contact me via e-mail
(yagiura "at" nagoya-u.jp).