Typesafe Entities – Part I

Context

Entities, as we know, are noun-like things that exist in a given problem domain. For example, a personal system of records contains entities such as Person (or Individual), Address, Contact, and GeoLocation. Structurally, each entity is a composition of primitive and composite types. In certain design practices, behaviors are also included in the entity definition itself. More recent paradigms, such as FP and type-theoretic approaches, tend to handle this requirement through polymorphic method implementations, where behaviors are attached to types as the use case demands.

Irrespective of the design methodology chosen, a resilient model design is essential for a successful implementation. Most functional regression issues can be traced back to some kind of inconsistency within the data layer. Non-functional aspects are a slight variant of this situation, but can be governed similarly with better configuration choices. To work around such issues, implementors often find themselves stuffing the entity and application code with additional boilerplate, only to see it get out of hand in due course. This leads us to the topic of this article: type-safe entities. If you are a disciplined developer, this may already sound familiar, in which case you can skip this article.

To limit the scope, we will restrict ourselves to type-safe domain models and to how certain traits can be enforced to achieve common expected behavior across all known entity types. For reference, we will use a sample entity from the personal records management space: a Person object. Here is the canonical representation of this entity:

Person{
    id : Me1024
    name : Mr.SomeBody
    height : 5.4
    address : {
        streetNumber : 12345
        streetName : StreetThatILiveOn
        cityName : myFavoriteCity
        zipCode : 123ZYX
        country : myFavoriteCountry
    }
    contactInfo : {
        phoneNumber : 1234567890
        emailAddress : hello@world.com
        webSite : www.world.com
    }
}

As you can see, the attributes of an entity are a mixed bag of primitives and non-primitives. The structure follows the HOCON (Human-Optimized Config Object Notation) format in principle. HOCON is a superset of JSON.

Some goals

For context, here are the criteria that we want model implementations to meet in order to call object definitions type-safe:

  • Types are checked at compile time (i.e., via static binding), which includes rejecting invalid copy signatures.
  • Data rules are validated at runtime, during object instantiation.
    • Macro expansion supports this intent by handling validation failures gracefully.
  • Deserialization supports validation of object namespace assignment, by design. This requires serialization to preserve object namespace information as part of the serialized representation.
  • Types should support extensibility, subject to the needs of the problem domain, such that additional regression is avoided.
  • Avoid boilerplate as much as possible.
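
The deserialization goal can be pictured with a small sketch. It is only an illustration under stated assumptions: Envelope, wrap, fieldsFor, PersonRecord and AddressRecord are hypothetical names that do not appear in the article's code base, and the fully qualified class name stands in for the "object namespace information".

```scala
import scala.reflect.ClassTag

// Hypothetical entity types, used only to show a namespace mismatch.
final case class PersonRecord(name: String)
final case class AddressRecord(cityName: String)

// Hypothetical serialized form: the type's class name travels with the data.
final case class Envelope(typeName: String, fields: Map[String, String])

object Envelope {
  // Serialization side: record which type the fields were written for.
  def wrap[T](fields: Map[String, String])(implicit tag: ClassTag[T]): Envelope =
    Envelope(tag.runtimeClass.getName, fields)

  // Deserialization side: hand out the fields only when the recorded
  // type name matches the type the caller expects.
  def fieldsFor[T](env: Envelope)(implicit tag: ClassTag[T]): Option[Map[String, String]] =
    if (env.typeName == tag.runtimeClass.getName) Some(env.fields) else None
}
```

With this shape, Envelope.fieldsFor[AddressRecord] applied to an envelope written by Envelope.wrap[PersonRecord] yields None instead of a half-populated object.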

In the subsequent sections, we will look at an intent-based approach to enforcing some of these criteria in our design and implementation. We will use features such as annotations, compile-time macros, and reflection.

Key Jargon

Here are definitions for some of the key jargon that we will refer to:

  • Annotation – A tag within the code that describes our intent for the corresponding code block or object definition. For example, a method that we expect to be tail-recursive is tagged with the @tailrec annotation, which makes the compiler verify that the tail-call optimization can be applied and fail the build otherwise.
  • Macro – A set of instructions that the compiler runs at compile time to add behaviors or logic to the code.
  • Trait – Similar to an interface, a trait abstracts the behaviors that objects of a similar nature exhibit.
  • Type class – Conceptually, a feature that aids in introducing ad-hoc polymorphic behaviors on types, without needing to modify their original implementation or definition.
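
To make the first and third terms concrete, here is a minimal sketch. Describable, Point, JargonSketch and sumTo are hypothetical names chosen for illustration; only @tailrec comes from the Scala standard library.

```scala
import scala.annotation.tailrec

// Trait: abstracts a behavior shared by otherwise unrelated types.
trait Describable {
  def describe: String
}

final case class Point(x: Int, y: Int) extends Describable {
  def describe: String = s"Point($x, $y)"
}

object JargonSketch {
  // Annotation: @tailrec asks the compiler to verify that the recursive
  // call is in tail position; compilation fails if it is not.
  @tailrec
  def sumTo(n: Int, acc: Long = 0L): Long =
    if (n <= 0) acc else sumTo(n - 1, acc + n)
}
```

If the recursive call were rewritten as n + sumTo(n - 1), the call would no longer be in tail position and the annotation would turn that into a compile error.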

Process

Equipped with a data sample and some related jargon, let’s take a look at what we’d like to achieve. Here’s the conceptual view:

[Figure: Compile time enrichment flow]

Let’s take a look at what is happening above:

  • Annotate entity classes with the annotation – ItsAGBoxADT.
    • @ItsAGBoxADT(doesValidations=true)
      • The doesValidations argument hints to the macro expansion logic that the corresponding ADT runs attribute validation checks, for example through require(...) calls.
      • In that case, the macro expansion produces enriched def apply(...) and def copy(...) methods that return an Option[ADT] rather than an ADT instance. This allows us to handle an IllegalArgumentException gracefully.
    • @ItsAGBoxADT() or @ItsAGBoxADT
      • This is the simple scenario, where the ADT is assumed to have no require(...) logic and no side effects need to be managed when instantiating the type.
  • case class variant of class definitions
    • case class is syntactic sugar provided by Scala, wherein the compiler generates the corresponding companion object along with default hashCode, equals, and copy implementations for the class.
    • The expansion logic fails the compilation if the annotated class is not a case class.
  • Use of gBoxADTBase[T]
    • All annotated ADT/Entity classes extend this base type.
    • This helps to inject necessary helper methods, such as an overridden toString() method that offers a formatted serialized representation.
  • Macro Expansion
    • Expansion, i.e., augmenting the annotated classes with the necessary features (extending the base type and adding appropriate apply(…) and copy(…) methods), is applied by the macro paradise compiler plugin.
    • The expansion logic works on AST (Abstract Syntax Tree) entities rather than on the hand-written, user-land code as we see it. Support can be extended as requirements grow. We will use a macro annotation: a class that extends scala.annotation.StaticAnnotation and whose macroTransform method is implemented as a def macro.
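
For intuition, the result of expanding an annotated class with doesValidations = true can be approximated by hand. The sketch below is not the macro's literal output: AdtBase is a hypothetical stand-in for gBoxADTBase[T], Street is a made-up entity, and the default arguments on copy are a convenience added here. It assumes Scala 2.12.2 or later, where a user-defined apply in the companion replaces the compiler-generated one.

```scala
// Hypothetical stand-in for gBoxADTBase[T]; the real base type may differ.
trait AdtBase[T]

final case class Street(number: Int, name: String) extends AdtBase[Street] {
  // Runtime data rule; throws IllegalArgumentException when violated.
  require(number > 0 && Option(name).exists(_.nonEmpty))

  // Replaces the compiler-generated copy: a violated rule yields None.
  def copy(number: Int = this.number, name: String = this.name): Option[Street] =
    Street(number, name)
}

object Street {
  // Replaces the compiler-generated apply with an Option-returning variant.
  def apply(number: Int, name: String): Option[Street] =
    try Some(new Street(number, name))
    catch { case _: IllegalArgumentException => None }
}
```

Callers therefore never see the exception: Street(0, "x") is simply None, and a copy that breaks a rule is None as well.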

Sample Application

Here’s the set of ADTs, annotated with @ItsAGBoxADT:

@ItsAGBoxADT(doesValidations = true)
case class Address(streetNumber:Int, 
                    streetName:String, 
                    cityName:String, 
                    zipCode:String, 
                    country:String){
    require(streetNumber > 0 &&
        Option(streetName).isDefined && streetName.nonEmpty &&
        Option(cityName).isDefined && cityName.nonEmpty &&
        Option(zipCode).isDefined && zipCode.nonEmpty &&
        Option(country).isDefined && country.nonEmpty)
}

@ItsAGBoxADT(doesValidations = true)
case class ContactInfo(phoneNumber:String, 
                        emailAddress:String, 
                        webSite:String){
    require(Option(phoneNumber).isDefined && 
        phoneNumber.nonEmpty && phoneNumber.length == 10 &&
        Option(emailAddress).isDefined && emailAddress.nonEmpty &&
        Option(webSite).isDefined && webSite.nonEmpty)
}

@ItsAGBoxADT(doesValidations = true)
case class Person(id:String,
                  name:String,
                  height:Double,
                  address:Address,
                  contactInfo:ContactInfo){
    require(Option(id).isDefined &&
        id.nonEmpty && id.length == 6 &&
        Option(name).isDefined && name.nonEmpty &&
        height > 2.0 && height < 8.0 &&
        Option(address).isDefined &&
        Option(contactInfo).isDefined)
}

Here, we are trying to do a few things:

  • Model domain objects as case classes, i.e., Address, ContactInfo and Person.
  • Add basic attribute-level validation checks to each of those definitions. These criteria are usually derived from the problem domain, along with certain non-functional criteria such as not-null or non-empty.
  • For shorter code (i.e., to show off FP conciseness, candidly speaking), we can refactor such validations into external traits using type classes and invoke them at the point of use, through constructs such as person.isValid. Here, isValid would accept objects of type gBoxADTBase[T], fetch the necessary validation criteria from a rules system of sorts, and return a boolean value.
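
The last point can be sketched with a small type class. This is one possible shape, not the article's implementation: Validator, ValidatorOps and Contact are hypothetical names, and the rule is hard-coded here rather than fetched from a rules system.

```scala
// Hypothetical type class: a validation rule set for values of type T.
trait Validator[T] {
  def isValid(value: T): Boolean
}

// Plain entity with no validation logic of its own.
final case class Contact(phoneNumber: String, emailAddress: String)

object Validator {
  // Rule set for Contact, kept outside the entity definition itself.
  implicit val contactValidator: Validator[Contact] = new Validator[Contact] {
    def isValid(c: Contact): Boolean =
      Option(c.phoneNumber).exists(_.length == 10) &&
        Option(c.emailAddress).exists(_.nonEmpty)
  }

  // Syntax enrichment so that call sites can write `value.isValid`.
  implicit class ValidatorOps[T](private val value: T) extends AnyVal {
    def isValid(implicit v: Validator[T]): Boolean = v.isValid(value)
  }
}
```

After import Validator._, a call such as Contact("1234567890", "hello@world.com").isValid compiles only for types that have a Validator instance in scope, so a missing rule set is caught at compile time.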

Let us now look at a sample unit test to validate the implementation:

import org.scalatest.{FlatSpec, Matchers}

class TestItsAGBoxADT extends FlatSpec with Matchers {
  //Test case
  it should "initialize person object and serialize to HOCON" in {
    //Initialize Address object
    val address = Address(streetNumber = 12345, streetName = "StreetThatILiveOn",
      cityName = "myFavoriteCity", zipCode = "123ZYX", country = "myFavoriteCountry")
    //Initialize ContactInfo object
    val contact = ContactInfo(
      phoneNumber = "1234567890",
      emailAddress = "hello@world.com", webSite = "www.world.com")

    //Check if we have valid address and contact, if then, create Person object
    (address, contact) match {
      case (Some(validAddress), Some(validContact)) => {
        Person(id = "Me1024", name = "Mr.SomeBody", height = 5.8, address = validAddress, contactInfo = validContact) match {
          case Some(validPerson) => {
            validPerson.copy(id = "Me9999", name = "Mr.AnyBody",
              height = 5.8, address = validAddress, contactInfo = validContact) match {
              case Some(validCopy) => {
                println(s"Serialized valid person copy is: \n ${validCopy.toString}")
                assert(!validPerson.equals(validCopy))
              }
              case None => assert(false, "Failed to get valid copy of Person object")
            }
          }
          case None => assert(false, "Unable to instantiate a valid Person object")
        }
      }
      case (_, _) => assert(false, "Unable to initialize valid address and/or contact objects.")
    }
  }
}

The tests are based on the ScalaTest framework. The scope here is to validate successful instantiation of the aforementioned types and to generate a serialized, HOCON-formatted representation of the Person object, with the expectation that the hierarchical structure of the entities is preserved.
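
Because every constructor in the test returns an Option, the nested match can also be flattened with a for-comprehension. The sketch below uses simplified, hypothetical stand-ins (Addr, Phone, Owner, Builders) with hand-written rules instead of the macro-generated apply methods.

```scala
// Hypothetical, simplified stand-ins for the entities in this article.
final case class Addr(city: String)
final case class Phone(number: String)
final case class Owner(id: String, addr: Addr, phone: Phone)

object Builders {
  // Each constructor returns None when its rule is violated.
  def addr(city: String): Option[Addr] =
    Option(city).filter(_.nonEmpty).map(c => Addr(c))

  def phone(number: String): Option[Phone] =
    Option(number).filter(_.length == 10).map(n => Phone(n))

  // The first None short-circuits the rest of the comprehension.
  def owner(id: String, city: String, number: String): Option[Owner] =
    for {
      validId <- Option(id).filter(_.length == 6)
      a       <- addr(city)
      p       <- phone(number)
    } yield Owner(validId, a, p)
}
```

The trade-off is that a plain Option does not say which rule failed; the nested match in the test can report a distinct message per level.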

Here’s the sample CLI-output:

$ sbt "testOnly TestItsAGBoxADT"
[info] Loading project definition from /cdev/gBoxConversationalPipes/gBoxCPCommons/project
[info] Set current project to root (in build file:/cdev/gBoxConversationalPipes/gBoxCPCommons/)
[info] Passed: Total 0, Failed 0, Errors 0, Passed 0
[info] Passed: Total 0, Failed 0, Errors 0, Passed 0
[info] No tests to run for macros/test:testOnly
[info] No tests to run for root/test:testOnly
[info] Compiling 1 Scala source to /cdev/gBoxConversationalPipes/gBoxCPCommons/core/target/scala-2.12/test-classes...
Processing macroExpansion for ADT -> <empty>.Address, that also handles validation - true
Processing macroExpansion for ADT -> <empty>.ContactInfo, that also handles validation - true
Processing macroExpansion for ADT -> <empty>.Person, that also handles validation - true
Serialized valid person copy is:
 Person {
 Address {
 cityName=myFavoriteCity
 country=myFavoriteCountry
 streetName=StreetThatILiveOn
 streetNumber=12345
 zipCode="123ZYX"
 }
 ContactInfo {
 emailAddress="hello@world.com"
 phoneNumber="1234567890"
 webSite="www.world.com"
 }
 height=5.8
 id=Me9999
 name="Mr.AnyBody"
}

[info] TestItsAGBoxADT:
[info] - should initialize person object and serialize to HOCON
[info] ScalaTest
[info] Run completed in 340 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[info] Passed: Total 1, Failed 0, Errors 0, Passed 1
[success] Total time: 7 s, completed Mar 4, 2018 9:22:43 AM

Proof Point

The interesting parts of the above output are the following lines. They show the macro-expansion debug output and the formatted, HOCON-style serialized representation of the Person object. Note that the debug lines appear only when there is something to compile; if the sources are already compiled, rerunning the tests will not print them.

Processing macroExpansion for ADT -> <empty>.Address, that also handles validation - true
Processing macroExpansion for ADT -> <empty>.ContactInfo, that also handles validation - true
Processing macroExpansion for ADT -> <empty>.Person, that also handles validation - true
Serialized valid person copy is:
 Person {
     Address {
         cityName=myFavoriteCity
         country=myFavoriteCountry
         streetName=StreetThatILiveOn
         streetNumber=12345
         zipCode="123ZYX"
     }
     ContactInfo {
         emailAddress="hello@world.com"
         phoneNumber="1234567890"
         webSite="www.world.com"
     }
     height=5.8
     id=Me9999
     name="Mr.AnyBody"
}

A few observations to make here:

  • Unlike our initial canonical representation, here the formatting is controlled by the library our application uses: Lightbend Config (earlier named Typesafe Config). The library is widely used, especially in Typesafe/Lightbend tools and implementations.
  • The order of attributes is slightly different: it now follows a lexicographically sorted sequence, which is why the capitalized Address and ContactInfo keys precede height, id and name. The reason is that the serialization support in gBoxADTBase[T] uses a TreeMap to obtain a sorted key-value map of the attributes. This is intentional. In a serious implementation, you can skip the sort if ordering is not something your logic needs, which saves some resources.
  • As for the quotes around some strings, this is the library’s way of dealing with special characters in the formatted representation.
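
The ordering effect is easy to reproduce in isolation. The sketch below is not the gBoxADTBase[T] code, which is not shown in this article; SortedRender and render are hypothetical names, and the output format is a bare key=value listing rather than what Lightbend Config renders.

```scala
import scala.collection.immutable.TreeMap

object SortedRender {
  // Renders key/value pairs in the natural String ordering of their keys.
  // Uppercase letters sort before lowercase ones in that ordering.
  def render(fields: Map[String, Any]): String =
    TreeMap(fields.toSeq: _*).toSeq
      .map { case (key, value) => s"$key=$value" }
      .mkString("\n")
}
```

Passing the keys name, Address, height and id in any order yields lines starting with Address, then height, id and name, which matches the ordering seen in the serialized Person above.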

The Magic

Let’s take a quick look at the macro magic that happens under the hood. Here’s the code of the macro logic behind the annotation that tags the respective entities. It is a bit verbose, with potential for further optimization (there are a few reasons why it is in its current shape):

import scala.language.experimental.macros
import scala.annotation.StaticAnnotation
import scala.annotation.compileTimeOnly
import scala.reflect.macros.whitebox.Context
import gBoxMacroUtils._

@compileTimeOnly("enable macro paradise to expand macro annotations")
class ItsAGBoxADT(doesValidations: Boolean = false) extends StaticAnnotation {
    /**
     * Hint to the macro engine about annottees that this macro would transform.
     * Refer to https://docs.scala-lang.org/overviews/macros/annotations.html for additional details
     */
    def macroTransform(annottees: Any*): Any = macro ItsAGBoxADTMacro.impl
}

//Object where the macro expansion is defined.
object ItsAGBoxADTMacro {

    /**
     * macro transformation implementation goes here!
     *
     * Idea here is for any given class or case class, within the name space - me.ganaakruti,
     * introduce a copy(...) method that matches the primary constructor of the annottee!
     */
     def impl(ctx: Context)(annottees: ctx.Expr[Any]*): ctx.Expr[Any] = {
        //Expose compiler API
        import ctx.universe._

        val annotationHandlesValidation: Boolean = ctx.prefix.tree match {
          case q"new ItsAGBoxADT(doesValidations = $handlesValidation)" => ctx.eval[Boolean](ctx.Expr(handlesValidation))
          case q"new ItsAGBoxADT($handlesValidation)" => ctx.eval[Boolean](ctx.Expr(handlesValidation))
          case _ => false
        }

        //TODO Need to add support for companion object inclusion in the below match...case
        val result = annottees.map(_.tree) match {
          case (classDecl: ClassDef) :: Nil => {
            println(s"Processing macroExpansion for ADT -> ${fetchFQCN(cName = classDecl.name.toString, ctx = ctx)}, that also handles validation - $annotationHandlesValidation")
            classDecl match {
              //Continue only if the class declaration is of type `case class`
              case q"$mods class $tpname[..$tparams] $ctorMods(...$paramss) extends { ..$earlydefns } with ..$parents { ..$body }" if mods.hasFlag(Flag.CASE) => {
                paramss.map { paramList => paramList.map { case q"$_ val $param: $_ = $_" => q"$param" } } match {
                  case someArgsList: List[List[Tree]] => {
                    q"..$someArgsList".nonEmpty match {
                      case false => {
                        abortWithMessage(
                          ctx,
                          s"Parameter count check failed. Annottee must have at least one parameter in its constructor.")
                      }
                      case true => {
                        val implObjName = TermName(s"${tpname}")
                        annotationHandlesValidation match {
                          case true => {
                            //TODO - need to extract the body of the companion object
                            q"""
                              $mods class $tpname[..$tparams] $ctorMods(...$paramss)
                                  extends { ..$earlydefns }
                                  with me.ganaakruti.gboxcp.commons.adt.gBoxADTBase[$tpname]
                                  with ..$parents {

                                def copy(...$paramss):Option[$tpname] = try{ Some(new $tpname(...$someArgsList)) }catch{ case _:IllegalArgumentException => None }
                                ..$body
                              }

                              object $implObjName{
                                def apply(...$paramss):Option[$tpname] = try{ Some(new $tpname(...$someArgsList)) }catch{ case _:IllegalArgumentException => None }
                              }
                            """
                          }
                          case false => {
                            q"""
                              $mods class $tpname[..$tparams] $ctorMods(...$paramss)
                                  extends { ..$earlydefns }
                                  with me.ganaakruti.gboxcp.commons.adt.gBoxADTBase[$tpname]
                                  with ..$parents {
                                def apply(...$paramss) = new $tpname(...$someArgsList)
                                def copy(...$paramss) = new $tpname(...$someArgsList)
                                ..$body
                              }
                            """
                          }
                        }
                      }
                    }
                  }
                }
              }
              case _ => abortWithMessage(ctx = ctx, msg = "Annottee must be a `case class`")
            }
          }
          case (classDecl: ClassDef) :: (moduleDef: ModuleDef) :: Nil => {
            abortWithMessage(ctx = ctx, msg = "Macros with pre-defined companions are not yet supported.")
          }
          case _ => abortWithMessage(ctx = ctx, msg = "Invalid annottee. Cannot expand the macro ItsAGBoxADT.")
        }
        //return the result
        ctx.Expr[Any](result)
      }
  }

Next steps & After-thoughts

  • Macro programming does require switching gears, as the objects that you manipulate are effectively ASTs (Abstract Syntax Trees).
    • It is not typical user-land any more, but it comes packed with some niceties like quasiquotes (see those weird string interpolators, `q"…"`). They help you construct and deconstruct ASTs without getting lost in a forest of parentheses.
    • You are not going to think in terms of objects, state manipulation, etc.; rather, the focus should be on types and type interactions.
    • Runtime reflection is all about understanding the concept of mirrors.
    • Macro annotations must be defined against the whitebox macro context rather than the blackbox one (a tip from the original documentation).
    • You need to familiarize yourself with the API, as there is a good deal of dynamic interlacing that happens, depending on what you are trying to do.
    • This article tries to summarize, if not simplify, the notion of meta-programming from a user-land abstraction standpoint.
  • The code base is part of an experiment. It will be extracted and shared via open code repositories (specifics are TBD).
  • There are some known issues with using the macro paradise plugin in the Scala REPL. Please glance through the link below on macro paradise plugin issues in the console.
  • Next, we’ll see some examples of type classes, which we will use to add behaviors to the vanilla entity definitions. Such behaviors will be applied to achieve our goal, i.e., ad-hoc polymorphic behavior tagging.

References
