Pretty-Printing an S-Expression in Go

My friend Mike and I seem to be forever implementing Forth-like and Lisp-like mini-languages. I used to write small interpreters for these languages in C, but I found that languages with built-in support for regular expressions let me write the lexical scanners much more quickly.

I modified a set of regexes that I had used in a Forth-like interpreter in Go (see: https://lawlessguy.wordpress.com/2013/07/20/an-rpn-interpreter-in-go-golang/) so that I could scan parentheses, single-quoted strings, double-quoted strings, numbers (integers), and what I refer to as symbols. Here’s the pretty-printer, sep.go, that I wrote (“pretty” means that it indents after it encounters a left parenthesis and removes that indentation after it encounters a right parenthesis):

// Copyright 2013 - by Jim Lawless
// License: MIT / X11
// See: http://www.mailsend-online.com/license2013.php
//
// Bear with me ... I'm a Go noob.
//
// Parse an S-Expression using regexes in Go

package main

import (
	"fmt"
	"io/ioutil"
	"log"
	"os"
	"regexp"
)

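// Token patterns. The symbol pattern is the catch-all, so it must not
// swallow parentheses: with "\\S+", the input "(foo)" scans as "(" and "foo)".
var patDquote = "[\"][^\"]*[\"]"
var patSquote = "['][^']*[']"
var patNumber = "[-]?\\d+"
var patParen = "[()]"
var patSymbol = "[^\\s()]+"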

var reDquote, reSquote *regexp.Regexp
var reNumber, reParen *regexp.Regexp
var reSymbol, reAll *regexp.Regexp

func indent(characters int) {
	for ; characters > 0; characters-- {
		fmt.Print(" ")
	}
}

func parse() {
	var s string
	var indent_ct int = 0

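	// Read all of standard input. (Since Go 1.16, io.ReadAll is the
	// preferred equivalent of ioutil.ReadAll.)
	b, err := ioutil.ReadAll(os.Stdin)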
	if err != nil {
		log.Fatal(err)
	}
	input := string(b)
	tokens := reAll.FindAllString(input, -1)
	for i := 0; i < len(tokens); i++ {
		s = tokens[i]
		indent(indent_ct)
		switch {
		case reDquote.MatchString(s):
			fmt.Printf("Double-Quote String: %s\n", s)
		case reSquote.MatchString(s):
			fmt.Printf("Single-Quote String: %s\n", s)
		case reNumber.MatchString(s):
			fmt.Printf("Number: %s\n", s)
		case reParen.MatchString(s):
			fmt.Printf("Parenthesis: %s\n", s)
			if s == "(" {
				indent_ct += 2
			}
			if s == ")" {
				indent_ct -= 2
			}
		case reSymbol.MatchString(s):
			fmt.Printf("Symbol: %s\n", s)
		default:
			fmt.Printf("Unknown token: %s\n", s)
		}
	}
}

func main() {
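	// Anchor the classifiers so that a token must match in full; unanchored,
	// a symbol such as "abc123" would be reported as a number.
	reDquote = regexp.MustCompile("^" + patDquote + "$")
	reSquote = regexp.MustCompile("^" + patSquote + "$")
	reNumber = regexp.MustCompile("^" + patNumber + "$")
	reParen = regexp.MustCompile("^" + patParen + "$")
	reSymbol = regexp.MustCompile("^" + patSymbol + "$")

	// The scanner tries the alternatives from left to right, so parentheses,
	// quoted strings, and numbers win over the catch-all symbol pattern.
	reAll = regexp.MustCompile(patParen + "|" + patDquote + "|" + patSquote + "|" +
		patNumber + "|" + patSymbol)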

	parse()
}

Let’s assume we have a file named “tmp.txt” which contains the following:

( this is "a test" )

(add (multiply 4 5) 6)

When you redirect tmp.txt’s contents into the sep program, here’s what you should see:

sep < tmp.txt

Parenthesis: (
  Symbol: this
  Symbol: is
  Double-Quote String: "a test"
  Parenthesis: )
Parenthesis: (
  Symbol: add
  Parenthesis: (
    Symbol: multiply
    Number: 4
    Number: 5
    Parenthesis: )
  Number: 6
  Parenthesis: )
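
The counter that drives the indentation can also serve as a cheap check for unbalanced input, since it goes negative on a stray right parenthesis and stays positive when one is left unclosed. What follows is only a sketch under my own assumptions: checkBalance and reTokens are hypothetical names that do not appear in sep.go, and the depth here counts nesting levels rather than spaces.

```go
package main

import (
	"fmt"
	"regexp"
)

// Token pattern in the same order as sep.go's combined regex; the symbol
// alternative excludes parentheses, which is an assumption of this sketch.
var reTokens = regexp.MustCompile(`[()]|"[^"]*"|'[^']*'|-?\d+|[^\s()]+`)

// checkBalance is a hypothetical helper. It walks the tokens with a depth
// counter and reports the first point where the parentheses stop balancing.
// Parentheses inside quoted strings are part of the string token and are
// therefore not counted.
func checkBalance(input string) error {
	depth := 0
	for i, tok := range reTokens.FindAllString(input, -1) {
		switch tok {
		case "(":
			depth++
		case ")":
			depth--
			if depth < 0 {
				return fmt.Errorf("token %d: unexpected right parenthesis", i)
			}
		}
	}
	if depth > 0 {
		return fmt.Errorf("%d unclosed left parenthesis(es)", depth)
	}
	return nil
}

func main() {
	for _, src := range []string{"(add (multiply 4 5) 6)", "(add 1", "add 1)"} {
		fmt.Printf("%q -> %v\n", src, checkBalance(src))
	}
}
```

The first input is balanced and yields a nil error; the other two each produce an error.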

About Jim Lawless

I've been programming computers for about 36 years ... 30 of that professionally. I've been a teacher, I've worked as a consultant, and have written articles here and there for publications like Dr. Dobbs Journal, The C/C++ Users Journal, Nuts and Volts, and others.