

Scaling Writes

09.06.2013

Most of the applications using Neo4j are read-heavy and scale by getting more powerful servers or adding additional instances to the HA cluster. Writes, however, can be a bit trickier. Before embarking on any of the following strategies, make sure the server is tuned; see the Linux Performance Guide for details. One strategy we’ve seen already is splitting the reads and writes to the cluster, so the writes only go to the master. The brave can even change the push factor to zero and set only a pull interval in neo4j/conf/neo4j.properties:

ha.tx_push_factor = 0
ha.pull_interval = 5s

By changing the default of 1 to 0, the slaves will only be updated when they pull from the master every five seconds.
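With this configuration only the master is guaranteed to be current, so the application has to do the routing itself. A minimal sketch of that routing (the host names are hypothetical, and in practice each URL would back a Neography connection):

```ruby
# Writes always go to the HA master; reads rotate across the slaves.
# Host names are made up for illustration.
MASTER = "http://master:7474"
SLAVES = ["http://slave1:7474", "http://slave2:7474"]

def endpoint_for(operation)
  # Simple round-robin over the slave instances for reads
  operation == :write ? MASTER : SLAVES.rotate!.first
end
```

Each URL would typically be wrapped in its own `Neography::Rest.new(...)` instance at boot rather than recreated per request.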

Another strategy is to accumulate writes and write periodically. Let’s take a closer look at this. I’m going to build a very simple performance test suite that points to a Ruby application that will send requests to Neo4j. I’ll be using Gatling, which you may remember from last Valentine’s day. We’re going to create two tests to start out with. One will make a POST request to http://localhost:9292/node, which will create one node at a time, and the other will send a POST request to http://localhost:9292/nodes, which will accumulate them first and then write.

class CreateNode extends Simulation {
  val httpConf = httpConfig
    .baseURL("http://localhost:9292")
 
  val scn = scenario("Create Nodes")
    .repeat(5000) {
    exec(
      http("create node")
        .post("/node")
        .check(status.is(200)))
      .pause(0 milliseconds, 1 milliseconds)
  }
 
  setUp(
    scn.users(10).protocolConfig(httpConf)
  )
}

I’ll skip the nodes code, but it's almost identical. The Ruby application that listens for these requests looks like:

post '/node' do
  $neo.create_node
  'node created'
end

post '/nodes' do
  $queue << [:create_node, {}]
  if $queue.size >= 100
    $neo.batch *$queue
    $queue = []
  end
  'nodes created'
end

The first takes the request and immediately sends it to Neo4j. The second accumulates the writes into a queue, and once that queue fills up to 100 it writes the requests in a single batch transaction. One of the beauties of the batch REST endpoint is that you can send it nodes to be created, relationships to be updated, Cypher queries, whatever you want.
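As a hedged sketch of mixing operation types in one batch (the names and the Cypher query are made up; `"{0}"` and `"{1}"` refer back to the nodes created by operations 0 and 1 of the same batch):

```ruby
# Each array is one operation in a single Neography batch call.
batch_ops = [
  [:create_node, {"name" => "Max"}],
  [:create_node, {"name" => "Neo4j"}],
  # "{0}" and "{1}" reference the nodes created earlier in this batch
  [:create_relationship, "likes", "{0}", "{1}", {}],
  [:execute_query, "start n=node(*) return count(n)"]
]
```

Passing this with `$neo.batch *batch_ops` would execute all four operations in one HTTP round trip.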

Let’s take a look at the performance numbers from Gatling. First, one node at a time:

[Gatling results: node_writes]

Our mean latency is 20 ms and we are doing 460 requests per second. Next, 100 nodes at a time:

[Gatling results: nodes_writes]

We can see our mean latency dropped threefold to 6 ms and our requests per second tripled to 1,436. That’s pretty significant. OK, what if we commit every 5,000 requests instead?

[Gatling results: large_nodes_writes]

We are able to get another 10% in requests per second, but our max response time jumped quite significantly. If we think about our application, this means most users will get fast response times, and one user every 5,000 requests will sit there hating life.

[Chart: accumulated_writes]

So let’s take a look at another way to handle this. We’re going to completely decouple our application from writing to Neo4j, and instead write to RabbitMQ.

When we receive the request to make a new node, we’ll publish it to a RabbitMQ Exchange to be handled later.

post '/evented_nodes' do
  message = [:create_node, {}]
  $exchange.publish(message.to_msgpack)
  'node created'
end 
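These handlers assume a RabbitMQ connection, channel, exchange, and in-process queue set up at application boot. That setup isn’t shown in the post; a minimal sketch using the Bunny gem (the host and exchange name are assumptions) might look like:

```ruby
require 'bunny'     # RabbitMQ client
require 'msgpack'   # compact serialization used by the handlers

# Hypothetical connection setup; host and exchange name are made up.
$connection = Bunny.new("amqp://localhost:5672")
$connection.start
$channel  = $connection.create_channel
$exchange = $channel.direct("neo4j.writes")
$queue    = []   # in-process accumulator for /evented_accumulated_nodes
```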

We can even reuse our accumulated strategy here:

post '/evented_accumulated_nodes' do
  $queue << [:create_node, {}]
  if $queue.size >= 100
    $exchange.publish($queue.to_msgpack)
    $queue = []
  end
  'nodes created'
end

A service is subscribed to the queue of the exchange and grabs these messages.

queue.bind(exchange).subscribe(:ack => true, :block => true) do |delivery_info, metadata, payload|
  message = MessagePack.unpack(payload)
  $last = delivery_info.delivery_tag
  $messages << message
  $cnt += 1
     
  if $cnt >= MAX_BUFFER
    WriterConsumer.process_messages
  end
end

… and a consumer processes them:

def self.process_messages
  # Only flush when there are pending messages and either the buffer is
  # full or the timer has expired; this avoids sending an empty batch
  # when the timer fires with nothing queued.
  if !$messages.empty? && ($cnt >= MAX_BUFFER || self.times_up?)
    batch_results = $neo.batch *$messages

    # Acknowledge all messages up to the last delivery tag
    $channel.acknowledge($last, true)
    self.reset_variables
  end
end
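The `times_up?` and `reset_variables` helpers are not shown in the post; a plausible sketch (the names come from the code above, the bodies and the timeout value are assumptions) would be:

```ruby
class WriterConsumer
  TIME_TO_WAIT = 2   # seconds between forced flushes (assumed value)

  # Has enough time passed since the last flush?
  def self.times_up?
    Time.now - $last_flush >= TIME_TO_WAIT
  end

  # Clear the buffer and restart the clock after a flush
  def self.reset_variables
    $messages   = []
    $cnt        = 0
    $last_flush = Time.now
  end
end
```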

I took a screen capture of RabbitMQ hard at work, processing about 1,100 messages per second:

[Screenshot: rabbitmq_queues]

So what does the performance look like on these? First, the single-evented node test:

[Gatling results: evented_nodes]

The mean latency is eight ms, and the max latency is 40 ms, both of which look great, but our requests per second went down to 1,036. How about the accumulated evented node test:

[Gatling results: evented_accumulated_writes]

Now we’re cooking. Our mean and max latencies are very small and our requests per second jumped to 1,747.

If you take the time to read the Ruby code, you may notice that I’m accumulating writes in the writer service as well, but besides waiting until I have a certain number of messages, I also have a timer that triggers the writes. You can use the same idea in your web app to commit every X writes or every Y seconds to handle bursts of writes as well as slow periods.

# Set up timers
$timers.every(TIME_TO_WAIT) { WriterConsumer.process_messages }

# Start timers in a background thread
timer_thread = Thread.new do
  loop { $timers.wait }
end
timer_thread.abort_on_exception = true

Finding the right latency and throughput numbers for your application is important, so experiment with what makes sense for you. Also, make sure you run tests on the hardware you will be running in production; your laptop numbers will be completely different. The implementation of the accumulated-writes technique I am using will not survive a web server crash, so in your production application consider using a durable form of storage like Redis, Riak or Hazelcast instead of an in-memory Ruby array.
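As a hedged sketch of that durable variant (the handler path, key name, and threshold are made up), the in-memory queue could be replaced with a Redis list, which survives a web server restart:

```ruby
require 'redis'
require 'msgpack'

$redis = Redis.new   # assumes a local Redis server

# Hypothetical handler: RPUSH persists pending writes in Redis instead of
# an in-memory array, so a crashed web process loses nothing.
post '/durable_nodes' do
  $redis.rpush("neo4j:pending_writes", [:create_node, {}].to_msgpack)
  if $redis.llen("neo4j:pending_writes") >= 100
    ops = $redis.lrange("neo4j:pending_writes", 0, -1).map { |m| MessagePack.unpack(m) }
    $neo.batch *ops
    $redis.del("neo4j:pending_writes")
  end
  'nodes created'
end
```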



Published at DZone with permission of Max De Marzi, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)