Friday, March 28, 2008

Using Pygments with less

The de-facto standard UNIX pager less supports an environment variable called LESSOPEN that can be set to the name of an input preprocessor. It is normally used to transparently view compressed files etc., but of course it can also colorize your source files using Pygments!

If you use Gentoo Linux, most of the work needed to set this up has already been done for you -- you just need to set

export LESSOPEN="|lesspipe.sh %s"
export LESSCOLORIZER=pygmentize

and make sure your LESS variable contains -r or -R so that the raw ANSI color codes are passed through by less. Gentoo's lesspipe.sh script will then automatically call Pygments for source code files.
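If you are not sure whether your LESS variable already carries one of these flags, a small helper can add -R only when it is missing. This is a minimal sketch: the function names has_raw_flag and with_raw_flag are made up for illustration, and the check is a heuristic that looks only at short-option clusters and the two long option names, so an option that takes a string argument could fool it.

```shell
#!/bin/sh
# Succeeds if a LESS-style option string already seems to contain -r or -R.
# Heuristic only: it inspects short-option clusters and the two long names.
has_raw_flag() {
    for word in $1; do    # unquoted on purpose: split the string on whitespace
        case "$word" in
            --raw-control-chars|--RAW-CONTROL-CHARS) return 0 ;;
            --*) ;;                   # any other long option: ignore it
            -*[rR]*) return 0 ;;      # short cluster such as -FRX
        esac
    done
    return 1
}

# Print the option string, appending -R when no raw flag was found.
with_raw_flag() {
    if has_raw_flag "$1"; then
        printf '%s\n' "$1"
    else
        printf '%s\n' "${1:+$1 }-R"
    fi
}

LESS=$(with_raw_flag "${LESS:-}")
export LESS
```

Placed in a shell startup file, this leaves an existing LESS value such as -FRX alone and turns a value such as -i into -i -R.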

On other platforms, you can set up a lesspipe.sh script yourself; it should look roughly like this:

#!/bin/sh
case "$1" in
    # add all extensions you want to handle here
    *.awk|*.groff|*.java|*.js|*.m4|*.php|*.pl|*.pm|*.pod|*.sh|\
    *.ad[asb]|*.asm|*.inc|*.[ch]|*.[ch]pp|*.[ch]xx|*.cc|*.hh|\
    *.lsp|*.l|*.pas|*.p|*.xml|*.xps|*.xsl|*.axp|*.ppd|*.pov|\
    *.diff|*.patch|*.py|*.rb|*.sql|*.ebuild|*.eclass)
        pygmentize "$1" ;;
    *)
        exit 0 ;;
esac

Then export LESSOPEN="|lesspipe.sh %s" and enjoy colored viewing!
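The export only helps if the script is executable and can be found through your PATH, since less starts the preprocessor by name. The following is a rough sketch of a sanity check; check_helper is a hypothetical name, not part of less or Pygments.

```shell
#!/bin/sh
# Report whether a LESSOPEN helper can be found through PATH and executed.
# Prints "ok: <path>" on success; otherwise complains on stderr and returns 1.
check_helper() {
    helper=$1
    if helper_path=$(command -v "$helper" 2>/dev/null) && [ -x "$helper_path" ]; then
        printf 'ok: %s\n' "$helper_path"
    else
        printf 'missing or not executable: %s\n' "$helper" >&2
        return 1
    fi
}
```

Running check_helper lesspipe.sh before opening a file shows quickly whether a missing chmod +x or a PATH entry is the reason the output is not colored.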

5 Comments:

TK said...

Excellent trick. :) It works a treat; thanks!

Deuce868 said...

That is very sweet. Thanks for the tip. I wonder if there is a way to get it to autodetect the format like vim or something.

Peter said...

At least on Debian-based distros, less comes with the lesspipe and lessfile commands, which provide preprocessing features for less. On these systems, your script can be used to extend these capabilities by placing it in the executable file ~/.lessfilter.

Note that in ~/.lessfilter the unrecognized case must be exit 1;; rather than exit 0;;. For the exact setup details, see: man lesspipe

It works great with this setup on Ubuntu. Thank you for the tip.
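The exit-status rule described above can be sketched as a small function. This is an illustration under assumptions, not the contents of any distro's file: the function name lessfilter, the short extension list, and the reuse of the LESSCOLORIZER variable (with pygmentize as the default) are all choices made for this sketch.

```shell
#!/bin/sh
# Minimal ~/.lessfilter sketch. The colorizer defaults to pygmentize;
# honouring LESSCOLORIZER here is an assumption of this sketch.
lessfilter() {
    case "$1" in
        # extend this list with the extensions you want to handle
        *.py|*.rb|*.sh|*.[ch])
            "${LESSCOLORIZER:-pygmentize}" "$1" ;;
        *)
            return 1 ;;   # unrecognized: leave the file to lesspipe's defaults
    esac
}

# In the real ~/.lessfilter, finish the file with:
#   lessfilter "$1"
```

Because a script's exit status is that of its last command, ending the file with the lessfilter call passes the non-zero status for unrecognized files on to lesspipe.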

Reuben said...

Here's an expanded .lessfilter script that falls back to using (recent) file to get the MIME type, so it works for files without an extension. The number of languages currently recognised that way is small, but being a maintainer of file, now that I've found a good use I'll add MIME types to file for all the languages it recognises.

#!/bin/sh
# .lessfilter to use pygmentize

case "$1" in
    # add all extensions you want to handle here
    *.awk|*.groff|*.java|*.js|*.m4|*.php|*.pl|*.pm|*.pod|*.sh|\
    *.ad[asb]|*.asm|*.inc|*.[ch]|*.[ch]pp|*.[ch]xx|*.cc|*.hh|\
    *.lsp|*.l|*.pas|*.p|*.xml|*.xps|*.xsl|*.axp|*.ppd|*.pov|\
    *.diff|*.patch|*.py|*.rb|*.sql|*.ebuild|*.eclass)
        pygmentize "$1"; exit 0;;
esac

# "$1" is quoted so that file names containing spaces survive
case $(file --mime-type --brief --dereference --uncompress "$1") in
    # add all MIME types you want to handle here
    text/x-c|text/x-c++|text/x-makefile|text/x-pl1|text/x-asm|\
    text/x-pascal|text/x-java|text/x-bcpl|text/x-m4|text/x-po)
        pygmentize "$1"; exit 0;;
esac

exit 1

Reuben said...

Here's an expanded version of my file-using filter that works better, as it manually sets the lexer based on the MIME type.

#!/bin/sh
# .lessfilter to use pygmentize

case "$1" in
    # add all extensions you want to handle here
    *.awk|*.groff|*.java|*.js|*.m4|*.php|*.pl|*.pm|*.pod|*.sh|\
    *.ad[asb]|*.asm|*.inc|*.[ch]|*.[ch]pp|*.[ch]xx|*.cc|*.hh|\
    *.lsp|*.l|*.pas|*.p|*.xml|*.xps|*.xsl|*.axp|*.ppd|*.pov|\
    *.diff|*.patch|*.py|*.rb|*.sql|*.ebuild|*.eclass)
        pygmentize "$1"; exit 0;;
esac

# start empty so a lexer variable inherited from the environment is ignored
lexer=
case $(file --mime-type --brief --dereference --uncompress "$1") in
    # add all MIME types you want to handle here
    text/troff) lexer=nroff;;
    text/html) lexer=html;;
    application/xml|image/svg+xml) lexer=xml;;
    text/x-c) lexer=c;;
    text/x-c++) lexer=cpp;;
    text/x-makefile) lexer=make;;
    text/x-pascal) lexer=pascal;;
    text/x-java) lexer=java;;
    text/x-po) lexer=po;;
    text/x-lua) lexer=lua;;
    text/x-python) lexer=python;;
    text/x-perl) lexer=perl;;
    text/x-shellscript) lexer=sh;;
    text/x-msdos-batch) lexer=bat;;
    text/x-diff) lexer=diff;;
    text/x-tex) lexer=latex;;
    # Types that pygmentize didn't support at time of writing
    #text/x-gawk, text/x-nawk, text/x-awk, text/x-asm, text/x-bcpl,
    #text/x-m4, text/x-pl1
esac

if [ -n "$lexer" ]; then
    pygmentize -l "$lexer" "$1"
    exit 0
fi

exit 1