Category Archives: Groovy & Grails

All things related to the Groovy programming language and Grails web application framework.

Adding Spring Security users in bulk (in Grails)

Earlier this week I was at a client trying to help them diagnose some issues with a CSV import process. I was of course aware of Ted Naleid’s seminal blog post on bulk updates in Grails, and the issues mentioned there seemed the most likely culprits. Unfortunately, it didn’t turn out nearly as straightforward as I’d hoped.

We started with VisualVM, looking for any obvious problems with the memory usage. Nothing showed up, and in fact the import wasn’t creating a lot of records anyway. We progressed on to JProfiler and P6Spy, hoping to see some hotspots in the code or particularly slow queries. We did identify a couple of places that seemed to be taking the majority of the time, but it still wasn’t clear to me whether the issue was in code, Grails, Hibernate, or the database.

That day we implemented a workaround that shifted some work into a background thread using the Platform Core event bus. This was a reasonable thing to do anyway, considering the requirements of the business logic. Yet I was still left wondering why certain parts of the import process were fundamentally slow.

It bugged me enough that I decided to investigate one of the major culprits of the slow import: the Spring Security Core plugin’s UserRole.create() method. Perhaps I could reproduce the problem in a small project without the complexity of the client project. It seemed simple enough to be worth a try. And so I created a new Grails 2.3.7 project with Spring Security Core installed and the following controller action:

@Transactional
def createUsers() {
    def startTime = System.currentTimeMillis()
    
    def roles = (0..<60).collect {
        new Role(authority: RandomStringUtils.randomAlphabetic(12)).save()
    }
    
    def users = (0..<100).collect {
        new User(
                username: RandomStringUtils.randomAlphabetic(12),
                password: RandomStringUtils.randomAscii(20)).save()
    }
    
    for (r in roles) {
        for (u in users) {
            UserRole.create u, r, false
        }
    }

    println "Total: ${(System.currentTimeMillis() - startTime) / 1000} s"

    redirect uri: "/"
}

To my relief, this took more than 30 seconds to complete on the first run. That seemed a lot slower than it should be, considering it’s only creating a total of 760 records. There was obviously some underlying issue here that I wasn’t seeing. I tried to clear and flush the session every 20 iterations, but that didn’t have a significant impact.

My next step was to simply create 760 Role records and then, independently, 760 User records. Both of these only took a few seconds. So what was special about UserRole? Why did its creation seem to be so expensive? I wanted to eliminate the database as a problem, so I tried using Groovy SQL (basically native JDBC) for the UserRole persistence. The total time dropped to a few seconds. So not the database then.

A Google search brought up another blog post on inserting data via Grails, by Marc Silverboard. In addition to using native JDBC, he suggests using a Hibernate stateless session. This sounded like an interesting possibility, so I shoe-horned it into my test action:

def createUsers() {
    ...
    def session = sessionFactory.openStatelessSession()
    def tx = session.beginTransaction()
    def counter = 0
    for (r in roles) {
        for (u in users) {
            session.insert(new UserRole(user: u, role: r))
            counter++
            if (counter % 20 == 0) {
                session.flush()
            }
        }
    }
    tx.commit()
    session.close()
    ...
}

It’s certainly uglier code, partly as I decided to do batch flushing every 20 rows (I also configured Hibernate’s JDBC batch size to 20). The results were worth it: the total import time came down to just 1 second! Obviously the issue was Hibernate’s caching in the session. Conundrum solved. I was still left wondering why the caching was such an issue only for UserRole, but that was a question for another time.
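
For reference, the JDBC batch size mentioned above is ordinary Hibernate configuration. In a Grails 2.x project I would expect it to live in the hibernate block of grails-app/conf/DataSource.groovy. The fragment below is a sketch under that assumption, with the value 20 chosen only to match the flush interval used in the loop:

```groovy
// grails-app/conf/DataSource.groovy (assumed location for a Grails 2.x app)
hibernate {
    // ... existing settings stay as they are ...

    // Intended to map to Hibernate's hibernate.jdbc.batch_size property
    jdbc.batch_size = 20
}
```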

It would have been easy to stop at this point and bask in the glow of a job well done. Unfortunately, that’s not really me. With my engineering background, I did wonder whether the new code was bypassing more than just Hibernate’s caching. And then I remembered validation. Could validation be the real issue? In order to isolate that particular feature, I reverted all the code back to its original state and then modified the UserRole.create() method to use the validate: false option. I restarted the server and then clicked on the link that triggered the user creation. 10 seconds! I did it again. 6 seconds. After a few more times it settled down at just under 4 seconds. Wow.

Why is validation such an issue on UserRole? I have no idea. I did give deepValidate: false a go, but it didn’t show nearly as big an improvement as switching off validation completely. Maybe one of my readers understands what’s going on and can provide us with the answer. Or perhaps not knowing will bug me enough to get me looking deeper. But for now, I just want to summarise my findings:

  • Grails and Hibernate have a lot of moving parts – it can be hard to diagnose issues
  • You really do need to invest some quality time and rigour for any diagnosis phase
  • Hibernate stateless sessions are worth investigating for any bulk inserts
  • Validation could be a significant hidden problem – try disabling it

I think in this case, the drop from 4s to 1s may make it worthwhile using a stateless session. But in either case, be sure you can do without the validation! And I hope this helps you with your own GORM bulk insertions, either with the diagnosis or the solution.

Contributing to the Groovy documentation

I like contributing to open source projects. I also love using Groovy for programming. Unfortunately, contributing to programming languages scares me because of all the grammar and parser stuff. I’m sure I could get into the internals with time, but I feel that time is better spent elsewhere. Now, one of those places is the Groovy user guide.

Groovy has been without a proper user guide for a long time now. Yes, there are various pages on the wiki with useful information, but it’s mostly unstructured. So the announcement of a full-blown user guide with language specification filled me with anticipation. And recently, the penny finally dropped and I realised this is something that I can contribute to. I know how to write Groovy, so all that’s required is a little bit of writing.

Shared Grails JARs for Tomcat deployment

While I was at GR8Conf US, one of the attendees asked me how to deploy two Grails WAR files to Tomcat without running into the dreaded “out of permgen space” error. This problem stems from Grails apps loading a lot of classes, and each webapp gets its own copy of those classes. So that’s pretty much double the permgen usage when you deploy two Grails WARs to a single Tomcat instance.

The common solution to this problem is to put the library JARs common to all Grails applications into Tomcat’s shared lib directory. Then there will only be one copy of the corresponding classes loaded in the VM regardless of how many webapps are deployed. It’s a pretty neat solution considering how many common JARs there are between Grails apps, but Grails throws in an additional challenge in that some per-application state is actually per-VM state. So deploying more than one Grails WAR into a Tomcat with shared Grails JARs can cause issues.

A quick web search brings up this question on StackOverflow with a corresponding list of the JARs that can be shared and those that can’t. Certainly for Grails 2.0+, it seems that only the grails-* JARs are unsafe, so I came up with a short events script that splits the JARs, putting the Grails ones in the WAR file and the rest in a sharedLibs directory:

eventCreateWarStart = { warName, stagingDir ->
    if (grailsEnv == "production") {
        def sharedLibsDir = "${grailsSettings.projectWorkDir}/sharedLibs"

        ant.mkdir dir: sharedLibsDir
        ant.move todir: sharedLibsDir, {
            fileset dir: "${stagingDir}/WEB-INF/lib", {
                include name: "*.jar"
                exclude name: "grails-*"
            }
        }

        println "Shared JARs put into ${sharedLibsDir}"
    }
}

Note that this fragment goes into the scripts/_Events.groovy file in the Grails project. I hope it helps folks!

Where next for Grails?

A time comes for every open source project when it has to take a step back, reflect on the past and decide where it needs to go next. The world rarely stays the same as when the project was born and in the tech world things change year on year. Back when Grails first came out, Java web frameworks were still a pain to configure and you had to pull together a whole host of libraries to do what you wanted. The use of conventions was rare and ease of use didn’t seem to be a high priority.

Now in 2013 there are probably hundreds of web frameworks for the JVM, most of which have learned lessons from Rails and its ilk. In addition, we have Node.js spawning lookalikes and derivatives (such as Meteor) aiming to provide lightweight servers that can handle hundreds of thousands of concurrent connections through asynchronous IO. In this world, what is Grails’ value proposition and where should it be heading?

This is a pretty long post, so if you want to skip straight to my suggestions, head on down to the conclusion at the end of the article. But I’m hopeful you’ll find value in the rest of the article!

DRY JSON and XML with Grails

Have you ever tried to support both JSON and XML in your REST API with Grails? There is the very straightforward:

class MyController {
    def index = {
        def objs = ...
        withFormat {
            json {
                render objs as JSON
            }
            xml {
                render objs as XML
            }
        }
    }
}

It works well and is trivial to implement, but it does suffer from a significant problem: you have no control over the JSON or XML generated. So if you’re serialising domain classes, any change to your internal domain model will be reflected in the public REST API. That is Not A Good Thing.

You might then look into using the JSON and XML builders instead, but you will soon discover that they have different syntax and the structure of JSON and XML is different anyway. So you find yourself writing separate code to render JSON and XML for each action that’s part of your REST API. That’s more work than you really want.

Can you somehow marry the two to get the best of both worlds: minimal coding but control over what goes into the public API? If you don’t mind a few constraints on the output, I think you can. My proposal boils down to:

  1. Transform the data into nested maps and lists of maps, filtering out anything you don’t want in the public API
  2. Use ‘... as JSON’ for JSON responses
  3. Use a custom method for rendering the same data as XML

Let’s take an example from the grails.org application. I want to generate a list of plugins in both JSON and XML forms so that consumers have a choice of format. The first step is easy: get a list of the plugins I want to render from the database. The next step involves transforming those plugin domain instances into a hierarchical structure based on maps. Here’s the code I came up with:

class PluginController {
    ...
    protected transformPlugins(plugins) {
        return [ pluginList: plugins ?
                plugins.collect { p -> transformPlugin(p) } :
                [] ]
    }

    protected transformPlugin(plugin) {
        def pluginMap = [
                name: plugin.name,
                version: plugin.currentRelease,
                title: plugin.title,
                author: plugin.author,
                authorEmail: plugin.authorEmail,
                description: plugin.summary,
                grailsVersion: plugin.grailsVersion,
                documentation: plugin.documentationUrl,
                file: plugin.downloadUrl,
                rating: plugin.avgRating ]
            
        if (plugin.issuesUrl) pluginMap.issues = plugin.issuesUrl
        if (plugin.scmUrl) pluginMap.scm = plugin.scmUrl

        return pluginMap
    }
    ...
}

The dynamic nature of Groovy and its expressiveness make the transformation pretty simple to effect. This could probably be simplified even further if we had something similar to the bindData() method that would map all properties of a domain instance to Map entries except those specified in an exclusion list.
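
As a rough illustration of that idea, here is a minimal sketch in plain Groovy. The propertiesAsMap() helper and the PluginInfo class are hypothetical names invented for this example; nothing like them exists in Grails as far as I know. It relies on Groovy’s standard properties accessor and has only been thought through for simple POGOs. A real GORM domain instance carries extra injected properties and associations that would probably need adding to the exclusion list.

```groovy
// Hypothetical stand-in for a domain class; not part of grails.org
class PluginInfo {
    String name
    String title
    String internalNotes
}

// Hypothetical helper: copies all bean properties into a map,
// skipping the named exclusions and Groovy's own bookkeeping entries.
Map propertiesAsMap(Object bean, Collection<String> excludes = []) {
    def ignored = ['class', 'metaClass'] + excludes
    bean.properties.findAll { key, value -> !(key in ignored) }
}

def plugin = new PluginInfo(name: 'shiro', title: 'Apache Shiro', internalNotes: 'private')

// Only 'name' and 'title' should survive the exclusion list
def pluginMap = propertiesAsMap(plugin, ['internalNotes'])
println pluginMap
```

An exclusion list keeps the call sites short, but it also means any new property on the class leaks into the public API by default. An inclusion list would be the safer choice if that matters more than brevity.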

Rendering the plugins as JSON then becomes a simple matter of:

def plugins = Plugin.list(...)
render transformPlugins(plugins) as JSON

since JSON maps perfectly to the map structure I’ve created. XML is a trickier proposition because it doesn’t have a straightforward concept of objects and lists. We could use as XML, but then we would end up with a whole bunch of <map> and <list> elements. No, we have to use a different approach.

I plumped for using the render() method in its XML builder form:

class PluginController {
    ...
    protected renderMapAsXml(map, root = "root") {
        render contentType: "application/xml", {
            "${root}" {
                mapAsXml delegate, map
            }
        }
    }

    protected mapAsXml(builder, map) {
        for (entry in map) {
            if (entry.value instanceof Collection) {
                builder."${entry.key}" {
                    for (m in entry.value) {
                        "${entry.key - 'List'}" {
                            mapAsXml builder, m
                        }
                    }
                }
            }
            else {
                builder."${entry.key}"(entry.value)
            }
        }
    }
    ...
}

So now rendering the plugins as XML becomes:

def plugins = Plugin.list(...)
renderMapAsXml transformPlugins(plugins), "plugins"

where the second argument is the name of the root element in the generated XML. The great thing is, this method can be applied to _any_ transformed data as it’s not specific to plugins.

One thing to note is that the mapAsXml() method assumes that the name of any map key that has a list as its value consists of ‘<name>List’. The corresponding XML becomes a parent <nameList> element with nested <name> elements for each of the list elements. This convention simplifies the whole process without being excessively onerous.
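
To see the convention in action outside a Grails controller, here is a minimal standalone sketch. It swaps the controller’s render() method for groovy.xml.MarkupBuilder and uses groovy.json.JsonOutput in place of the ‘as JSON’ converter, so treat it as an approximation of the approach rather than the grails.org code. The sample plugin data is made up.

```groovy
import groovy.json.JsonOutput
import groovy.xml.MarkupBuilder

// Walks the map, turning '<name>List' entries into a parent element
// with one nested <name> element per list item.
def mapAsXml(builder, Map map) {
    map.each { key, value ->
        if (value instanceof Collection) {
            builder."${key}" {
                value.each { item ->
                    builder."${key - 'List'}" {
                        mapAsXml(builder, item)
                    }
                }
            }
        }
        else {
            builder."${key}"(value)
        }
    }
}

// Renders the whole map under the given root element and returns the XML text
String renderMapAsXml(Map map, String root = "root") {
    def writer = new StringWriter()
    def builder = new MarkupBuilder(writer)
    builder."${root}" {
        mapAsXml(builder, map)
    }
    writer.toString()
}

// Made-up sample data in the same shape that transformPlugins() returns
def data = [pluginList: [
        [name: "shiro", version: "1.1"],
        [name: "quartz", version: "0.4"]]]

def json = JsonOutput.toJson(data)
def xml = renderMapAsXml(data, "plugins")
println json
println xml
```

Both outputs come from the same map, so a field only has to be added or removed in one place.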

So there you are: Don’t Repeat Yourself rendering with JSON and XML. It may not be the most efficient approach computationally, but it will save a fair bit of development and maintenance time. And don’t forget that you always have the option of caching the responses.