Colophon

Posted by W. Caleb McDaniel

Tools I Use

How This Site is Built

The posts and pages on this site begin as plain-text files written in Pandoc’s extended version of Markdown. I then use the bash shell script below to turn those files into flat HTML documents that are uploaded to my server.

The shell script is a much more basic version of full-featured static site generators like Hakyll, Jekyll, and Hyde, and it most closely resembles rawk. I looked at some of these programs but wanted to see if I could build something lighter for myself using Unix tools I was already familiar with. Pandoc is robust enough, for my purposes, to do most of the heavy lifting with a simple pandoc HTML template, which I’ve posted for reference here. For now, at least, this script also manages to conform to the Hakyll philosophy: not only does configuration take fewer than 100 lines of code, but the whole shell script is under 100 lines. I took some inspiration for it from this page and a few others.

The key part of the script is the long echo command that runs after pandoc converts each post to HTML. This command writes one record per post, with fields separated by %, and awk uses those records later in the script to generate an RSS feed and the lists of posts for the main page and each category page.
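To make the record format concrete, here is a minimal sketch of how %-separated records can be turned into a post list and into feed fields. The two sample records and the function names `make_postlist` and `make_feed_fields` are made up for illustration; only the field order (sort date, title, file name, post date, RSS date, clip) follows the script.

```shell
#!/bin/sh
# Two made-up records in the script's field order:
# sortdate%title%file%postdate%rssdate%clip
records='140301%First Post%first.html%March 1, 2014%Sat, 01 Mar 2014 00:00:00 CST%<p>Hello</p>
140410%Second Post%second.html%April 10, 2014%Thu, 10 Apr 2014 00:00:00 CDT%<p>World</p>'

# Sort newest first on the leading yymmdd field, then print Markdown list items.
make_postlist() {
    sort -nr | awk 'BEGIN { FS = "%" } { print "* [" $2 "](" $3 ") | " $4 }'
}

# Print the title (field 2) and RSS date (field 5) of each record, newest first.
make_feed_fields() {
    sort -nr | awk 'BEGIN { FS = "%" } { print "<title>" $2 "</title><pubDate>" $5 "</pubDate>" }'
}

printf '%s\n' "$records" | make_postlist
printf '%s\n' "$records" | make_feed_fields
```

Because every page that lists posts reads the same records, adding a new kind of listing should mostly be a matter of writing another awk print statement over the same fields.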

To style the site, I use a customized, minimal version of Bootstrap for the responsive layout, and Glyphicons for the social media icons. Some colors were inspired by Simon Pascal Klein and Ethan Schoonover, though neither should be blamed for my amateurish design choices!

I’ve set up the script to update the code below every time I upload changes to this site. I have also included comments in the HTML source for individual pages so that interested geeks can see which parts of the site are added using pandoc’s options and which are part of the Markdown files that form the main content. I’m still a shell scripting newbie, so if you see problems with the code or have suggestions about improving it, I’d be grateful if you’d let me know at . The script and all of the files that make up the website can also be found in a GitHub repository, though it may not always be as current as this site.
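As a rough illustration of how a listing can be refreshed on every run, the sketch below wraps a file in a fenced code block with a timestamp line above it. The helper name `wrap_source` and the throwaway demo file are hypothetical; the actual script does the equivalent with awk on its own source.

```shell
#!/bin/sh
# wrap_source FILE: print a timestamp line, then FILE inside a
# five-backtick fenced bash block. Hypothetical helper for illustration.
wrap_source() {
    printf 'Code used to generate site on %s\n' "$(date)"
    printf '\n%s\n' '`````bash'
    cat "$1"
    printf '\n%s\n' '`````'
}

# Demo on a throwaway file rather than on the running script.
demo=$(mktemp)
printf '%s\n' 'echo "hello"' > "$demo"
wrap_source "$demo"
rm -f "$demo"
```

Passing the result to pandoc together with the page's Markdown file would then append the listing to the rendered page.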

Code used to generate site on Mon Apr 14 09:18:08 CDT 2014

#!/bin/sh

LOCDIR=$HOME/Dropbox/website # Run script from this directory
PUBDIR=$HOME/publish
FOOTER=$LOCDIR/_footer.html
NAVBAR=$LOCDIR/_navigation.html
PANOPTS="--smart --standalone -f markdown --template=website.html\
 --css=./bootstrap.css --css=./main.css --include-before-body=$NAVBAR"

# $PANOPTS above assume that the website template is in
# $HOME/.pandoc/templates/ and that the CSS file is in $PUBDIR.
# Next block assumes posts to be published are ...
# 1. In folders by category in $LOCDIR.
# 2. In markdown files with *.txt extension.
# 3. Contain a standard pandoc title block in first three lines.

> $LOCDIR/.allposts
echo "Processing posts ..."
find `ls -l $LOCDIR | awk '/^d/ {print $NF}'` -type d -maxdepth 1 | \
while read -r folder
do
CATEGORY=$(basename "$folder")
for file in "$folder"/*.txt
do
POST=$(basename "$file" .txt)
    if head -n 1 "$file" | grep -Eq "^%"; then
    TITLE=$(sed -n '1 s/% //p' "$file")
    POSTDATE=$(sed -n '3 s/% //p' "$file" | sed 's/[ ]$//')
    # Next two lines use BSD date command. For GNU date, use commented line
    # Thanks to @fravashi http://github.com/wcaleb/website/issues/1
    SORTDATE=$(date -jf '%B %e, %Y' "$POSTDATE" +%y%m%d)
    # SORTDATE=$(date -d "$POSTDATE" +%y%m%d)
    RSSDATE=$(date -jf '%B %e, %Y' "$POSTDATE" '+%a, %d %b %Y 00:00:00 %Z')
    # RSSDATE=$(date -d "$POSTDATE" '+%a, %d %b %Y 00:00:00 %Z')
    if [ "$file" -nt "$PUBDIR/$POST.html" ]; then
        echo "| $POST"
        pandoc $PANOPTS\
         --variable=category:"$CATEGORY"\
         --include-after-body="$FOOTER"\
         --output=$PUBDIR/"$POST".html\
         "$file"
    fi
    CLIP=$(grep -m 1 -Eo '<p>.+</p>' "$PUBDIR/$POST.html")
    echo "$SORTDATE%$TITLE%$POST.html%$POSTDATE%$RSSDATE%$CLIP"\
     >> "$LOCDIR/$CATEGORY.txt"
    fi
done
cat $LOCDIR/"$CATEGORY".txt >> $LOCDIR/.allposts
sort -nr $LOCDIR/"$CATEGORY".txt |\
 awk 'BEGIN{FS="%"};{print "* [" $2 "](" $3 ") | " $4 }'\
 > $LOCDIR/.postlist
pandoc $PANOPTS\
 -A "$FOOTER"\
 --output=$PUBDIR/"$CATEGORY".html\
 $LOCDIR/"$CATEGORY".pdc $LOCDIR/.postlist
rm $LOCDIR/"$CATEGORY".txt
done

echo "Processing index ..."
sort -nr $LOCDIR/.allposts | sed -n '1,5 p'|\
 awk 'BEGIN{FS="%"};{print "* [" $2 "](" $3 ") | " $4 }'\
 > $LOCDIR/recentposts.pdc 
pandoc $PANOPTS\
 -A "$FOOTER"\
 -o $PUBDIR/index.html\
 $LOCDIR/index.pdc $LOCDIR/recentposts.pdc

if [ "$LOCDIR/cv.pdc" -nt "$PUBDIR/cv.html" ]; then
echo "Processing CV ..."
pandoc $PANOPTS\
 -A "$FOOTER"\
 -o $PUBDIR/cv.html\
 $LOCDIR/cvhead.pdc $LOCDIR/cv.pdc
sed -E 's/^[^#\[\\]/\\\ind &/g' $LOCDIR/cv.pdc |\
 pandoc -s -S -f markdown --latex-engine=xelatex\
 --template=cv.tex\
 -o $PUBDIR/mcdanielcv.pdf
fi

echo "Processing colophon ..."
cat "$0" |\
 awk '
 BEGIN { print "Code used to generate site on"; system("date");
 print "\n`````bash" }
 { print }
 END { print "\n`````" }' > $LOCDIR/.script
pandoc $PANOPTS\
 -A "$FOOTER"\
 -o $PUBDIR/colophon.html\
 $LOCDIR/colophon.pdc $LOCDIR/.script
rm $LOCDIR/.script

echo "Processing RSS feed ..."
cp $LOCDIR/_feed.xml $PUBDIR/feed.xml
sort -nr $LOCDIR/.allposts | sed -n '1,5 p'|\
 awk 'BEGIN{FS="%"}
 {print "\t<item>"}
 {print "\t\t<title>" $2 "</title>"}
 {print "\t\t<link>http://wcm1.web.rice.edu/" $3 "</link>"}
 {print "\t\t<guid>http://wcm1.web.rice.edu/" $3 "</guid>"}
 {print "\t\t<pubDate>" $5 "</pubDate>"}
 {print "\t\t<description>" $6 "[...]</description>\n\t</item>"}
 END{print "</channel>\n</rss>"}'\
 >> $PUBDIR/feed.xml

exit 0