Wednesday, May 16, 2012

Scala Nugget – Pattern matching and Lists


I was whining recently about how my Scala code is Java in poor disguise. I started reading the Scala by Example PDF that also comes with the Scala installation. I just read some interesting things about lists and pattern matching that gave me an idea of how to “scalafy” the following Java-flavoured Scala code:
  def importResource(name:String, resource:Resource):Unit = {
    log.debug("importing " + name + " into " + root)
    val path = pathOf(name)
    createResource(path.toList)(resource)
  }

  private def createResource(nodes:List[String])(resource:Resource) =  {
    val directory = nodes.dropRight(1).foldLeft(root)((directory, name) => {
      if(name.equals("")) {
        directory
      }
      else {
        val directoryOption:Option[AbstractDirectory] = directory.getDirectory(name)
        directoryOption.getOrElse({
          val subDir = directory.createDirectory(name) match {
                         case result:AbstractDirectory => result
                       }
          directory.add(subDir)
          subDir
        })
      }
    })
    directory.createIfNewer(nodes.last, _:Resource)
  }
If it’s not crystal clear to you what the code does, here is an overview:
  • The first method takes a name and a resource
  • The name is split up into its path elements
  • The path is passed to the createResource method, which creates every directory that does not yet exist on the way to the resource and finally returns a function that takes a resource as input and creates it in the last directory.
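The two mechanisms in that last step, folding over the directory names and returning a partially applied function, can be tried in isolation. This is only a minimal sketch: plain strings stand in for directories and resources, and create is a hypothetical stand-in for createIfNewer, not the real method.

```scala
object PathSketch {
  // Hypothetical stand-in for directory.createIfNewer; it only builds a string.
  def create(name: String, content: String): String = name + " <- " + content

  val nodes = List("", "docs", "guides", "intro.txt")

  // Everything except the last node is a directory name; empty names are skipped.
  val dirPath: String = nodes.dropRight(1).foldLeft("root") { (dir, name) =>
    if (name.isEmpty) dir else dir + "/" + name
  }

  // Fixing the name and leaving a typed placeholder yields a String => String.
  val createLast: String => String = create(nodes.last, _: String)

  def main(args: Array[String]): Unit = {
    println(dirPath)             // root/docs/guides
    println(createLast("hello")) // intro.txt <- hello
  }
}
```

The `_: String` placeholder is what makes createResource in the post return a function instead of creating the resource right away.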
A big issue I have with the code is the createResource method: it is very hard to name. createAllDirectoriesAndReturnResourceCreatingMethod would illustrate better how smelly that method really is. I have an idea of how to refactor this with Scala’s pattern matching. This is the new version (after some red/green bar cycles):
  def importResource(name:String, resource:Resource):Unit = {
    log.debug("importing " + name + " into " + root)
    val path = pathOf(name)
    createResource(path.toList, root, resource)
  }

  private def createResource(nodes:List[String],
                             directory:AbstractDirectory,
                             resource:Resource):Unit = {
    nodes match {
      case head :: Nil => directory.createIfNewer(head, resource)
      case "" :: tail => createResource(tail, directory, resource)
      case head :: tail => {
        val subDir = directory.getDirectory(head).getOrElse {
          val result = directory.createDirectory(head) match {
                                   case dir:AbstractDirectory => dir
                                 }
          directory.add(result)
          result
        }
        createResource(tail, subDir, resource)
      }
      case Nil => // no more elements to traverse
    }
  }
The first thing that strikes me about the second version is the imbalance between the different cases. The head :: tail case is not exactly a one-liner… but it should be. This smells like feature envy. We could ask the directory to getOrCreateDirectory and it would read much better. But that is another refactoring. First let’s go through this one:
One thing I did not get at first in Scala is how you can iterate through lists using pattern matching. To see how that works, it is important to realise one thing about Scala’s lists:
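As a rough idea of what asking the directory could look like, here is a minimal sketch. SketchDir is a hypothetical in-memory class made up for illustration; it is not the AbstractDirectory from the code above, and only the method name getOrCreateDirectory comes from the text.

```scala
import scala.collection.mutable

// Hypothetical in-memory directory, NOT the post's AbstractDirectory.
class SketchDir(val name: String) {
  private val children = mutable.Map.empty[String, SketchDir]

  def getDirectory(child: String): Option[SketchDir] = children.get(child)

  // Look the child up; create and register it only when it is missing.
  def getOrCreateDirectory(child: String): SketchDir =
    children.getOrElseUpdate(child, new SketchDir(child))
}

object GetOrCreateSketch {
  def main(args: Array[String]): Unit = {
    val root = new SketchDir("root")
    val first = root.getOrCreateDirectory("docs")
    val second = root.getOrCreateDirectory("docs")
    println(first eq second) // true: the second call finds the existing directory
  }
}
```

With something like this in place, the head :: tail case could shrink to a single recursive call that passes directory.getOrCreateDirectory(head) along.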
List("a", "b", "c", "d")
is equivalent to
"a" :: ("b" :: ("c" :: ("d" :: Nil)))
which, because :: is right-associative, is equivalent to:
"a" :: "b" :: "c" :: "d" :: Nil
Or in plain English: lists are not flat structures, lists are recursive structures. In practical terms that means that, given any element in the list, it is very easy to split the list at that element into head (the current element) and tail (the rest of the elements).
For instance, given the element "b" above, head will be "b" and tail will be "c" :: "d" :: Nil.
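These equivalences are easy to check for yourself. The sketch below uses only the standard List API; the object and value names are made up for the example.

```scala
object ConsSketch {
  val literal = List("a", "b", "c", "d")
  val nested  = "a" :: ("b" :: ("c" :: ("d" :: Nil)))
  val chained = "a" :: "b" :: "c" :: "d" :: Nil

  // :: is a method on the list to its right, so the chain can also be
  // written as explicit method calls starting from Nil.
  val explicit = Nil.::("d").::("c").::("b").::("a")

  // The sub-list that starts at "b".
  val fromB = literal.drop(1)

  def main(args: Array[String]): Unit = {
    println(literal == nested && nested == chained && chained == explicit) // true
    println(fromB.head) // b
    println(fromB.tail) // List(c, d)
  }
}
```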
OK, not that hard. Now to the cool part: :: is also a case class, which in short means that it can be used in pattern matching:
"a" :: "b" :: "c" :: Nil match { case h :: t => println(h + " -> " + t); … }
will assign "a" to the variable h and "b" :: "c" :: Nil to the variable t.
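A complete version of that match, as a small sketch with made-up function names. It also shows one thing worth knowing when writing such patterns: an identifier that starts with a lowercase letter is a fresh variable that matches anything, so the empty list has to be written as Nil with a capital N.

```scala
object MatchSketch {
  def describe(xs: List[String]): String = xs match {
    case h :: t => h + " -> " + t // non-empty list: bind head and tail
    case Nil    => "empty"        // the empty list
  }

  // Lowercase nil is NOT the empty list; it is a variable that matches any list.
  def alwaysMatches(xs: List[String]): Boolean = xs match {
    case nil => true
  }

  def main(args: Array[String]): Unit = {
    println(describe("a" :: "b" :: "c" :: Nil)) // a -> List(b, c)
    println(describe(Nil))                      // empty
    println(alwaysMatches(List("x")))           // true
  }
}
```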
We can now understand the cases above. In pseudocode:
  private def createResource(nodes:List[String],
                             directory:AbstractDirectory,
                             resource:Resource):Unit = {
    nodes match {
      case head :: Nil =>  //the last element of the list, create the resource using head as a name
      case ""   :: tail => // special case, an empty directory name. Skip to the next name:
                           // createResource(tail, directory, resource)
      case head :: tail => //head will be a directory-name that we use to get or create the 
                           //nextDirectory that is used with tail to call ourselves recursively:
                           // createResource(tail, nextDirectory, resource)
      case Nil => //no more elements to traverse
    }
  }
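To watch the four cases run, here is a self-contained toy version. It is only a sketch under assumptions: MemDir is a hypothetical in-memory stand-in for AbstractDirectory, a plain String stands in for Resource, and files are simply overwritten instead of going through createIfNewer.

```scala
import scala.collection.mutable

// Hypothetical in-memory directory, NOT the post's AbstractDirectory.
class MemDir(val name: String) {
  val dirs  = mutable.Map.empty[String, MemDir]
  val files = mutable.Map.empty[String, String]
  def getOrCreate(child: String): MemDir =
    dirs.getOrElseUpdate(child, new MemDir(child))
}

object TraversalSketch {
  @annotation.tailrec
  def createResource(nodes: List[String], directory: MemDir, content: String): Unit =
    nodes match {
      case head :: Nil  => directory.files(head) = content           // last element: the file name
      case "" :: tail   => createResource(tail, directory, content)  // skip empty names
      case head :: tail => createResource(tail, directory.getOrCreate(head), content)
      case Nil          => ()                                        // nothing to traverse
    }

  def main(args: Array[String]): Unit = {
    val root = new MemDir("root")
    createResource(List("", "docs", "guides", "intro.txt"), root, "hello")
    println(root.dirs("docs").dirs("guides").files("intro.txt")) // hello
  }
}
```

Every recursive call is the last thing its case does, which is why the sketch can carry the tailrec annotation.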
I’m far from done with the refactoring. To begin with, I want to get rid of the special case of "" and, as mentioned above, remove some feature envy. Still, I think the cases are more readable than the original code, at least in showing where the problems lie in terms of special cases and bloated cases. Of course, it’s just an opinion and I reserve the right to change my mind tomorrow :)

Written by johlrogge
