Friday, July 9, 2021

Why Haskell became my favorite scripting language

I started learning Haskell a couple of years ago. Usually when I learn a programming language past the “Let’s see what it does differently” stage, I do so because there is a concrete use-case - a project I want to contribute to. With Haskell it was different. My primary motivation wasn’t a concrete use-case; I had the impression that studying the language would be worth it just for the sake of learning. It does many things differently from most mainstream languages and I wanted to become familiar with concepts I hadn’t seen before. (Spoiler: It was worth it.)

Writing a couple of iterations of “Hello World” applications wasn’t going to get me far, and utilizing it in a project at work also wasn’t an option, so I started using it for my ad-hoc scripts or utility scripts. Previously Python was my go-to language for such things, but Haskell ended up replacing it. In this article I will go through some of the reasons.

A compiled language?

Haskell is known as a compiled language, so you may be wondering how on earth you’d use it as a scripting language. The answer is that Haskell code can also be interpreted. The ghc package - the package for the de-facto standard compiler for Haskell - contains the runhaskell program, which uses this interpreter mode. You can use it to execute Haskell files without an intermediate compile step.

If you put the following content in a file called foo.hs and make it executable (chmod +x foo.hs), you can execute it directly using ./foo.hs:

#!/usr/bin/env runhaskell

main :: IO ()
main = putStrLn "Hello World"

Interpreted mode is slower than compiling the code to an executable and running that, but for most scripting tasks it is fast enough.

Stack Script

runhaskell is nice, but if you do anything more involved than Hello World you end up wanting to re-use existing libraries. This is where stack script comes in. stack is a build tool for Haskell and it provides the stack script command, a more powerful alternative to runhaskell.

Let’s look at an example:

#!/usr/bin/env stack
{- stack script --resolver lts-17.9
  --package process
-}

main :: IO ()
main = putStrLn "Hello World"

The first line is the shebang that makes sure the script is executed using stack. The second line tells stack that it should use the stack script command. The --resolver flag needs some further elaboration:

To build a program, stack needs to know which versions of the dependencies to use. It relies on package sets for this decision. A package set is a pre-defined set of package versions known to work together. The --resolver flag chooses the package set you want to use - including the GHC version. You can think of the --resolver option as pinning the versions of all dependencies, with the assurance that they work together. If you want to learn more about that, take a look at Stackage.

The third line lists a dependency the script is going to use. You can add any number of dependencies this way:

#!/usr/bin/env stack
{- stack script --resolver lts-17.9
  --package process
  --package directory
  --package network-uri
-}

Another feature of stack script is the --optimize flag: it causes the script to be compiled whenever it changes. This slows down the first execution of a script, but subsequent calls are much faster because they use the cached, compiled version.
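To show how the pieces fit together, here is a small sketch of a complete script that uses two of the packages listed above and the --optimize flag. The task itself (counting lines of the .hs files in the current directory) and the haskellFiles helper are made up for illustration, and the call to the external wc tool assumes a Unix-like system.

```haskell
#!/usr/bin/env stack
{- stack script --resolver lts-17.9
  --optimize
  --package process
  --package directory
-}

import Data.List (isSuffixOf, sort)
import System.Directory (listDirectory)
import System.Process (readProcess)

-- Hypothetical helper: keep only the Haskell source files of a listing,
-- sorted by name.
haskellFiles :: [FilePath] -> [FilePath]
haskellFiles = sort . filter (".hs" `isSuffixOf`)

main :: IO ()
main = do
  entries <- listDirectory "."
  let files = haskellFiles entries
  if null files
    then putStrLn "No .hs files found"
    -- Delegate the counting to the external `wc` tool (Unix-like systems only).
    else readProcess "wc" ("-l" : files) "" >>= putStr
```

Keeping the filtering in a pure function such as haskellFiles means it can be checked without touching the file system.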

(In the JVM world there is jbang, which does something similar. Overall I think this approach isn’t as widespread as it deserves to be, and I once prototyped pipex to see how this could work in the Python world. Don’t use that; it was just an experiment.)

If it compiles, it works

Or rather, if it type checks, it works. In many debates about static versus dynamic typing, the dynamic typing camp argues that static type checks don’t offer a lot because you need to write functional tests anyway. I think the argument has some flaws, mostly because one point of type checks is to let you get away with writing fewer tests. The types are supposed to make it impossible to create illegal states. If you accomplish that, you don’t need to write unit tests to verify error cases - the type checker forbids them. Furthermore, who writes tests for small utility scripts?

Although the “If it compiles, it works” mantra shouldn’t be taken literally - there are plenty of bugs it won’t catch - there is no denying that it is a feeling you get occasionally when writing Haskell.
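To make the “illegal states” idea concrete, here is a minimal sketch. The Download type, its constructors, and the describe function are hypothetical and only serve as an illustration. Instead of a record with an optional error message and an optional file path (which would allow both or neither to be set), a sum type admits exactly the states that make sense:

```haskell
-- Hypothetical example: the outcome of a download step in a script.
-- Each constructor carries only the data that is valid for that state.
data Download
  = NotStarted
  | Failed String      -- the error message
  | Finished FilePath  -- where the file was saved
  deriving (Show, Eq)

-- One equation per state. With warnings enabled (for example -Wall),
-- GHC reports a constructor that is not handled here.
describe :: Download -> String
describe NotStarted = "not started"
describe (Failed err) = "failed: " ++ err
describe (Finished path) = "saved to " ++ path

main :: IO ()
main = mapM_ (putStrLn . describe)
  [NotStarted, Failed "timeout", Finished "/tmp/emoji.json"]
```

There is no way to construct a value that is both failed and finished, so there is no such case to test.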

No ceremony

This ties into the previous two points. There is no tedious ceremony to bootstrap a project just to get dependency management. Haskell’s syntax is very succinct and its type inference is excellent. Together with the superb support for algebraic data types, this encourages using types even for ad-hoc scripts. In a language like Python you’d instead use dictionaries for everything - which may seem convenient but can also mask bugs.

For example, I have a script that gets a list of emojis from the JSON file of the gemoji project and pipes them into bemenu:

[Figure: bemenu demo]

The first version of this script was written in Python. It was pretty simple and worked, but it had a bug that I didn’t notice until the rewrite to Haskell.

In Haskell I defined a type, and derived a JSON de-serialization implementation:

data Emoji = Emoji
  { emoji :: String
  , description :: String }
  deriving (Show, Generic)

instance FromJSON Emoji

You can think of Emoji as a struct with two fields, one for the emoji itself and one for its description.

Then I parsed the records:

emojis = eitherDecode emojiFileContents :: Either String [Emoji]

This failed with an error message that told me some entries are missing the description property. I had made an assumption about the data that turned out to be wrong. Python let me get away with this assumption but Haskell didn’t.

I decided that I don’t want emojis without description to show up, and changed the code to filter them out:

emojis :: Either String [Emoji]
emojis = mapMaybe (parseMaybe parseJSON) <$> eitherDecode emojiFileContents

Don’t worry if you’re not familiar with Haskell and find it difficult to understand these code snippets. My point is that types can help you find mistakes, and Haskell makes it easy to utilize types without introducing a lot of additional ceremony.
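For readers who want to try this, here is a self-contained sketch of both steps. It rests on a few assumptions: the sample JSON is made up (with ASCII stand-ins instead of real emojis), the names strict and lenient are hypothetical, and the lenient variant decodes the array into generic JSON values first and keeps only the elements that parse as an Emoji, which is one possible way to implement the filtering.

```haskell
#!/usr/bin/env stack
{- stack script --resolver lts-17.9
  --package aeson
  --package bytestring
-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (FromJSON, Value, eitherDecode, parseJSON)
import Data.Aeson.Types (parseMaybe)
import qualified Data.ByteString.Lazy as BL
import Data.Maybe (mapMaybe)
import GHC.Generics (Generic)

data Emoji = Emoji
  { emoji :: String
  , description :: String
  } deriving (Show, Eq, Generic)

-- The decoder is derived from the record definition via Generic.
instance FromJSON Emoji

-- Made-up sample data; the second entry lacks "description".
sample :: BL.ByteString
sample = "[{\"emoji\":\":)\",\"description\":\"smile\"},{\"emoji\":\":(\"}]"

-- Strict: a single malformed entry makes the whole decode fail.
strict :: BL.ByteString -> Either String [Emoji]
strict = eitherDecode

-- Lenient: decode to generic JSON values first, then keep only the
-- elements that parse as an Emoji.
lenient :: BL.ByteString -> Either String [Emoji]
lenient input =
  mapMaybe (parseMaybe parseJSON)
    <$> (eitherDecode input :: Either String [Value])

main :: IO ()
main = do
  print (strict sample)   -- a Left with a message about the missing key
  print (lenient sample)  -- a Right containing only the complete entry
```

The strict version is what surfaces the wrong assumption about the data; the lenient version is a deliberate decision made afterwards.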

Parser combinator libraries

Writing utility scripts often involves parsing some kind of output.

Many people resort to using regular expressions if they need to parse text. Sometimes they’re a perfect fit, but a regex can quickly reach a complexity threshold where you wish you had used something else.

One (in my opinion underutilized) alternative to regular expressions is a parser combinator library. Such libraries let you write a parser as a set of small parse functions that you then combine to form the full parser. In some languages using them is a bit clunky, but the feature set of Haskell makes it convenient to use them.

Can you guess what the following code does?

literal :: Parser Expr
literal = number <|> stringLiteral <|> objectLiteral <|> arrayLiteral


stringLiteral :: Parser Expr
stringLiteral = StringLiteral . T.pack <$> string
  where
    quote = char '\'' <|> char '"'
    string = between quote quote (many (noneOf "\'\""))

How about this:

parseEntry :: Parser Entry
parseEntry = do
  char '-'
  char ' '
  date <- parseDate
  char ':'
  char ' '
  hours <- parseHours
  char ' '
  intervals <- between (char '(') (char ')') parseIntervals
  pure $ Entry date hours intervals
  where
    parseIntervals = parseInterval `sepBy` char ','
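If you want to run something like the second snippet yourself, here is a self-contained sketch. It makes several assumptions: it uses ReadP from base rather than the parser library used above, and the Entry and Interval types, the date and time formats, and the runParser helper are all made up for this example.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP

-- Hypothetical types; the snippet above does not show its own definitions.
data Interval = Interval String String
  deriving (Show, Eq)

data Entry = Entry
  { entryDate :: String
  , entryHours :: Int
  , entryIntervals :: [Interval]
  } deriving (Show, Eq)

-- Exactly n decimal digits.
digits :: Int -> ReadP String
digits n = count n (satisfy isDigit)

-- A date such as 2021-07-09, kept as a plain string for brevity.
parseDate :: ReadP String
parseDate = do
  y <- digits 4
  _ <- char '-'
  m <- digits 2
  _ <- char '-'
  d <- digits 2
  pure (y ++ "-" ++ m ++ "-" ++ d)

-- A clock time such as 09:00.
parseTime :: ReadP String
parseTime = do
  h <- digits 2
  _ <- char ':'
  m <- digits 2
  pure (h ++ ":" ++ m)

-- Two clock times separated by a dash, such as 09:00-12:00.
parseInterval :: ReadP Interval
parseInterval = Interval <$> parseTime <* char '-' <*> parseTime

-- A full line such as "- 2021-07-09: 8 (09:00-12:00,13:00-18:00)".
parseEntry :: ReadP Entry
parseEntry = do
  _ <- string "- "
  date <- parseDate
  _ <- string ": "
  hours <- read <$> munch1 isDigit
  _ <- char ' '
  intervals <- between (char '(') (char ')') (parseInterval `sepBy` char ',')
  eof
  pure (Entry date hours intervals)

-- Accept only a single parse that consumes the whole input.
runParser :: ReadP a -> String -> Maybe a
runParser p input = case readP_to_S p input of
  [(result, "")] -> Just result
  _ -> Nothing

main :: IO ()
main = print (runParser parseEntry "- 2021-07-09: 8 (09:00-12:00,13:00-18:00)")
```

Each small parser can be tried on its own, which is much harder to do with the sub-expressions of one large regex.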

Code evaluation with Haskell-language-server

Remember the point about testing?

The haskell-language-server supports evaluating code snippets in documentation comments:

[Figure: Haskell language server evaluation demo]

This doesn’t replace a proper test suite for a serious project, but it makes it ergonomic to quickly verify within a script if a function behaves the way you want it to. You can achieve something similar using an interactive REPL and sending code snippets to it, but I prefer this workflow.
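In text form, such a snippet looks roughly like the sketch below. It assumes the `-- >>>` comment convention of the language server’s eval plugin; the slugify function is a made-up example. The editor offers to evaluate the line starting with `>>>` and writes the result into the comment line below it.

```haskell
import Data.Char (toLower)
import Data.List (intercalate)

-- | Turn a title into a URL-friendly slug.
--
-- >>> slugify "Hello World"
-- "hello-world"
slugify :: String -> String
slugify = intercalate "-" . words . map toLower

main :: IO ()
main = putStrLn (slugify "Why Haskell became my favorite scripting language")
```

The evaluated example stays in the file as documentation, so re-running it after a change doubles as a quick regression check.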

Wrap up

Although any single point here may not convince you, all of them put together turn Haskell into a powerful scripting language.

If you’re looking for a low-risk way to learn the language, I recommend starting out by using it for ad-hoc scripts or small utility programs.