Saturday, July 14, 2012

Parsing TMX Maps using Scala

A co-worker and friend of mine was recently talking about how he wanted to make games for phones.  He's a fan of turn-based board games.  He was lamenting, though, how much work it would take just to create things like maps and backgrounds, let alone characters and the like.  Of course, there are tools emerging to aid in building these things.  One of them is Tiled.  Tiled allows you to create maps from a tile set image and store the resulting map definition in an XML file.  This format is called the TMX Map Format.

I've recently been playing around with some Scala, and I figured since I liked Scala's XML support so much, I would create a little library that would parse a TMX file, represent the elements as objects, and allow you to render a Tiled map pretty easily.

Who knows, maybe I will eventually turn it into a full-fledged game framework.  For now, though, I'm concentrating on just the TMX parsing and the utilities that allow you to get the map info into your program.

The XML.

Since Scala's XML parsing is first-class, it was pretty easy to get started.  As I started looking through the documentation and writing tests, a few interesting portions of the code started to pop up.  First, it appears that there are several ways to express the IDs of the tiles (each ID representing a 'tile' in the image it references).  The formats are: XML, CSV, and Base64.
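To make the CSV form concrete, here is a minimal sketch of turning the text of a CSV-encoded data element into tile IDs.  The object and method names (CsvLayerData, parseCsvGids) are hypothetical and not taken from the library; the IDs are kept as Long because they are unsigned 32-bit values.

```scala
object CsvLayerData {
  // Hypothetical helper: parses the text body of a CSV-encoded <data>
  // element into tile IDs. Whitespace and line breaks around the commas
  // are ignored, as is a trailing comma at the end of a row.
  def parseCsvGids(text: String): Vector[Long] =
    text.split(",").iterator
      .map(_.trim)
      .filter(_.nonEmpty)
      .map(_.toLong)
      .toVector
}
```

A call such as CsvLayerData.parseCsvGids("1,2,3,\n4,5,6") yields the six IDs in row order.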

In addition to the different formats, the Base64 data could optionally be compressed using gzip or zlib.  This presented me with the most interesting portions of the code.  First, the XML parser needed to be smart enough to create the Layers correctly depending on the formats.  Second, I needed to do some data manipulation to interpret the Base64 data the correct way.
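As a rough illustration of the compression step, the sketch below inflates already-decoded Base64 bytes with the JDK's java.util.zip classes, picking the stream type from the compression attribute ("gzip" or "zlib" in the TMX format).  The names LayerCompression, readAll, and decompress are hypothetical, and this is not necessarily how the library itself does it.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream}
import java.util.zip.{GZIPInputStream, InflaterInputStream}

object LayerCompression {
  // Reads a stream to the end, closes it, and returns everything it produced.
  private def readAll(in: InputStream): Array[Byte] = {
    val out = new ByteArrayOutputStream()
    val buffer = new Array[Byte](4096)
    try {
      var n = in.read(buffer)
      while (n != -1) {
        out.write(buffer, 0, n)
        n = in.read(buffer)
      }
    } finally in.close()
    out.toByteArray
  }

  // `compression` is the value of the compression attribute on <data>;
  // an empty string is assumed to mean the bytes are not compressed.
  def decompress(bytes: Array[Byte], compression: String): Array[Byte] =
    compression match {
      case "gzip" => readAll(new GZIPInputStream(new ByteArrayInputStream(bytes)))
      case "zlib" => readAll(new InflaterInputStream(new ByteArrayInputStream(bytes)))
      case ""     => bytes
      case other  => throw new IllegalArgumentException("Unsupported compression: " + other)
    }
}
```

Rejecting an unknown compression name outright, rather than falling back to the raw bytes, keeps a bad attribute from silently producing a garbage layer.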

Let's talk about how the XML parsing varies with the different tags.  As it turns out, there was very little code for me to write to parse the XML, even with the variable data format.  What's interesting is that I was able to make the parser return an immutable map object, creating each element and all sub-elements while constructing the hierarchy.  This is in keeping with the 'least mutable state' exercises I've been working on.  Here is a sample of the parsing code:



import java.io.InputStream
import scala.xml.{Node, XML}

// Builds the whole immutable TmxMap in a single expression while walking the XML.
def loadMap(src: InputStream): TmxMap = {
  val xml = XML.load(src)
  new TmxMap((xml \ VERSION).text,
    (xml \ ORIENTATION).text,
    (xml \ WIDTH).text.toInt,
    (xml \ HEIGHT).text.toInt,
    (xml \ TILE_WIDTH).text.toInt,
    (xml \ TILE_HEIGHT).text.toInt,
    // Tile sets (head replaces the old first method on the node sequence)
    xml \ TILE_SET collect { case tileSet: Node => new TmxTileset((tileSet \ FIRST_GID).text, (tileSet \ NAME).text, getValue((tileSet \ TILE_WIDTH).text), getValue((tileSet \ TILE_HEIGHT).text), createImage((tileSet \ IMAGE).head)) },
    // Layers: an encoding attribute means encoded text data, otherwise plain tile elements
    xml \ LAYER collect {
      case layer: Node =>
        if (!(layer \ DATA \ ENCODING).text.isEmpty) {
          TmxLayer.fromData((layer \ NAME).text, getValue((layer \ WIDTH).text), getValue((layer \ HEIGHT).text), (layer \ DATA \ ENCODING).text, (layer \ DATA \ COMPRESSION).text, (layer \ DATA).text)
        } else {
          TmxLayer.fromTiles((layer \ NAME).text, getValue((layer \ WIDTH).text), getValue((layer \ HEIGHT).text),
            layer \ DATA \ TILE collect { case tile: Node => new TmxTile(getValue((tile \ GID).text)) })
        }
    },
    // Object groups, each with its objects and their polygons
    xml \ OBJECT_GROUP collect {
      case objectGroup: Node => new TmxObjectGroup((objectGroup \ NAME).text, getValue((objectGroup \ WIDTH).text), getValue((objectGroup \ HEIGHT).text),
        objectGroup \ OBJECT collect {
          case obj: Node => new TmxObject((obj \ NAME).text, getValue((obj \ X).text), getValue((obj \ Y).text), getValue((obj \ GID).text),
            obj \ POLYGON collect { case poly: Node => new PropertiesFactory[TmxPolygon].setProperties(poly \ PROPERTIES, new TmxPolygon(parsePoints((poly \ POINTS).text))) })
        })
    })
}



What should pop out at you right away is that the loadMap function essentially has two statements.  One loads the XML, the other creates a new TmxMap.  There is a LOT going on here, and I often consider breaking it up into smaller methods so it's more readable and digestible, but it's interesting to see how you can create compound statements in Scala.

The next most interesting thing is that if the data comes across as Base64, the decoded byte array is supposed to be interpreted as a sequence of unsigned 32-bit little-endian integers.

Well, Java and Scala do not have unsigned integer types, so this required a little bit of conversion.  My first attempt used Sun's undocumented decoder, and I ran into an interesting issue with it.  My code did not remove the carriage returns from the XML data before passing it to the decoder.  The decoder returned values, but they were wrong.  I re-implemented the function using the Apache Commons decoder, and it works just fine.  This leads me to wonder, though: which is more correct?  Should I remove all carriage returns from my data before decoding, or should a decoder be able to recognize them?  I'm not sure which I would prefer.
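For illustration, here is a minimal sketch of both steps using only the JDK.  It swaps in java.util.Base64 (available from Java 8 onward) rather than either decoder mentioned above, and the names Base64LayerData, decode, and toGids are hypothetical.  The MIME variant of the JDK decoder ignores line separators and any other characters outside the Base64 alphabet, which is one possible answer to the carriage-return question.

```scala
import java.nio.{ByteBuffer, ByteOrder}
import java.util.Base64

object Base64LayerData {
  // The MIME decoder skips line breaks and other characters outside the
  // Base64 alphabet, so indented, multi-line XML text can be passed in as-is.
  def decode(text: String): Array[Byte] =
    Base64.getMimeDecoder().decode(text)

  // Reads each group of four bytes as a little-endian 32-bit value and
  // widens it to Long, masking off the sign extension so that the result
  // is never negative. Any trailing bytes short of a full group are ignored.
  def toGids(bytes: Array[Byte]): Vector[Long] = {
    val buffer = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN)
    Vector.fill(bytes.length / 4)(buffer.getInt() & 0xFFFFFFFFL)
  }
}
```

If the layer is compressed, the decompression step would sit between decode and toGids.  Storing the IDs as Long costs some memory compared with Int, but it avoids negative values for IDs that use the top bit.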

I've created a project for this called TMX-S-Parser on GitHub, and I intend to keep adding to it until I'm satisfied it is robust and complete.  Take a look at the code.  If you have the will, copy the repo, make some changes, recommend improvements, add utility.