30 Nov / 2010

Membase Java tutorial

Membase is a NoSQL database. It is designed to provide persistent
storage behind Memcached, a popular in-memory caching
tool. Membase is relatively new but has found a solid
footing in the high-performance NoSQL world.

This is a quick tutorial on using the Membase database from Java.

Setting up Membase

Follow the instructions from Membase.org.

Client

Interacting with Membase is like interacting with Memcached. We
will be using the SpyMemcached Java client, so download it from here.

Code

All the code needed to run this project is available at GitHub : https://github.com/sujee/membase-tutorial

It is an Eclipse project and is ready to go.

Let’s start

So here is the Java code – it writes a bunch of key-value pairs into
Membase and reads them back.

You can run this file (MembaseTest1) from Eclipse. To run it from the command line:

         sh compile.sh
         sh run.sh
             or
          java -cp classes/:lib/memcached-2.5.jar tutorial.MembaseTest1

Membase is running on localhost on port 11211 (default memcached port)

We start by ‘flushing’ the data-bucket, so we have a clean start.

Set

cache.set(string_key, expiration_time, object_value)
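
A note on expiration_time: as far as I know it follows the usual memcached convention, where 0 means the item never expires, values up to 30 days are treated as seconds from now, and anything larger is read as an absolute Unix timestamp. Here is a minimal sketch of that rule. ExpirationSketch and expiresAt are hypothetical names made up for illustration; they are not part of SpyMemcached or of the tutorial code, and negative values are not handled.

```java
/** Hypothetical helper illustrating the memcached expiration convention. */
final class ExpirationSketch {
    /** 30 days in seconds: the cut-off between relative and absolute values. */
    static final int THIRTY_DAYS = 60 * 60 * 24 * 30;

    /**
     * Returns the absolute expiry time in Unix seconds,
     * or -1 when the item never expires.
     */
    static long expiresAt(int expirationTime, long nowSeconds) {
        if (expirationTime == 0) {
            return -1; // 0 means "no expiry"
        }
        if (expirationTime <= THIRTY_DAYS) {
            return nowSeconds + expirationTime; // relative: seconds from now
        }
        return expirationTime; // larger values are absolute Unix timestamps
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis() / 1000;
        System.out.println("expires at: " + expiresAt(3600, now));
    }
}
```

So an expiration_time of 3600 keeps an item for one hour, while 0 keeps it until it is deleted or flushed.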

Our keys are stringified numbers.  And our objects are Integer objects.
Here is a sample output:

2010-11-29 23:36:33.234 INFO net.spy.memcached.MemcachedConnection:  Added {QA sa=localhost/127.0.0.1:11211, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2010-11-29 23:36:33.240 INFO net.spy.memcached.MemcachedConnection:  Connection state changed for sun.nio.ch.SelectionKeyImpl@b34bed0
cache put : 0 : 0,  result net.spy.memcached.internal.OperationFuture@578088c0
cache put : 1 : 1,  result net.spy.memcached.internal.OperationFuture@37922221
cache put : 2 : 2,  result net.spy.memcached.internal.OperationFuture@5afec107
cache put : 3 : 3,  result net.spy.memcached.internal.OperationFuture@b32e13d
cache put : 4 : 4,  result net.spy.memcached.internal.OperationFuture@39617189
cache put : 5 : 5,  result net.spy.memcached.internal.OperationFuture@2c64f6cd

The SpyMemcached client caches (queues) operations and sends them asynchronously to increase performance. So, as you can see, cache.set returns an ‘OperationFuture’ object instead of waiting for the write to finish. The key-value pair will eventually be persisted in Membase.
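
To make the ‘future’ idea concrete, here is a rough sketch that uses only the JDK. AsyncSetSketch is a hypothetical stand-in for an asynchronous client (it is not the SpyMemcached API): set() queues the write on a background thread and returns a Future right away, and calling get() on that Future blocks until the write has actually been applied.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical stand-in for an asynchronous cache client (JDK only). */
final class AsyncSetSketch {
    private final Map<String, Object> store = new ConcurrentHashMap<>();
    private final ExecutorService writer = Executors.newSingleThreadExecutor();

    /** Queues the write and returns immediately. */
    Future<Boolean> set(String key, Object value) {
        return writer.submit(() -> {
            store.put(key, value);
            return Boolean.TRUE;
        });
    }

    Object get(String key) {
        return store.get(key);
    }

    void close() {
        writer.shutdown();
    }

    /** Sets one key, waits up to five seconds for confirmation, then reads it back. */
    static Object setAndConfirm(String key, Object value) {
        AsyncSetSketch cache = new AsyncSetSketch();
        try {
            Future<Boolean> pending = cache.set(key, value);
            boolean stored = pending.get(5, TimeUnit.SECONDS); // block until the write is done
            return stored ? cache.get(key) : null;
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            throw new IllegalStateException("write was not confirmed", e);
        } finally {
            cache.close();
        }
    }

    public static void main(String[] args) {
        System.out.println("read back: " + setAndConfirm("42", 42));
    }
}
```

With the real client, the same idea would be to hold on to the object returned by cache.set and call get() on it whenever you need to know that a particular write went through.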

Get

Next we try to read back the same values we just wrote.  We are using the same client.  The output looks like this:

Cache get : 96 : 96
Cache get : 97 : 97
Cache get : 98 : 98
Cache get : 99 : 99
Time for 100 gets is 77 ms.  nulls 0

We are keeping an eye out for NULLs. We shouldn’t get any at this point, and as expected our null count is zero.
The code is also instrumented with timestamps so we can see how fast the operations are.
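
The read loop boils down to something like the sketch below. It is not the tutorial’s actual code: GetLoopSketch and countNulls are hypothetical names, and a plain HashMap stands in for the Membase connection so the snippet runs on its own.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of a timed read loop that counts missing values. */
final class GetLoopSketch {
    /** Looks up keys "0" .. "count-1" and returns how many came back null. */
    static int countNulls(Function<String, Object> lookup, int count) {
        int nulls = 0;
        for (int i = 0; i < count; i++) {
            Object value = lookup.apply(String.valueOf(i));
            if (value == null) {
                nulls++;
            }
        }
        return nulls;
    }

    public static void main(String[] args) {
        // Stand-in for the cache: every key from 0 to 99 is present.
        Map<String, Object> fakeCache = new HashMap<>();
        for (int i = 0; i < 100; i++) {
            fakeCache.put(String.valueOf(i), i);
        }

        long start = System.nanoTime();
        int nulls = countNulls(fakeCache::get, 100);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println("Time for 100 gets is " + elapsedMs + " ms.  nulls " + nulls);
    }
}
```

Passing the lookup as a function means the same loop could be pointed at a real client’s get method instead of the map.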

Simulating Multiple Clients

For the above example, we write and read from the same client. In real scenarios, however, multiple clients will read from and write to Membase. Let’s simulate that by using two different client connections.
So in this version we use one connection for writing and another connection for reading.

java -cp classes/:lib/memcached-2.5.jar tutorial.MembaseTest2


Cache get : 0 : 0

Cache get : 1 : 1

…….

…..

Cache get : 997 : null

Cache get : 998 : null

Cache get : 999 : null

Time for 1000 gets is 540 ms.  nulls 42
One thing that stands out: we are seeing NULLs when we
try to read back the values we just set!

So what is going on here?

Remember, the SpyMemcached client caches operations to increase
performance. When we call ‘shutdown’ without a timeout, the client exits
without writing all the cached values into Membase. So these
key-value pairs are simply lost! Not good!!

Let’s fix this.

After we are done with the SETs, let’s shut down the client gracefully.
Here we are giving it 10 seconds to shut down. This should
give the client a chance to write out all cached operations.
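
The difference between the two kinds of shutdown can be reproduced with nothing but the JDK. The sketch below is only an analogy, not SpyMemcached code: ShutdownSketch is a hypothetical class in which a single-threaded executor plays the role of the client’s operation queue. shutdownNow() throws away whatever is still queued, while shutdown() followed by awaitTermination(10, SECONDS) lets the queue drain first.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical analogy for abrupt vs graceful client shutdown (JDK only). */
final class ShutdownSketch {
    /** Queues `count` slow writes, shuts the writer down, and returns how many keys are missing. */
    static int missingAfterShutdown(int count, boolean graceful) {
        Map<String, Integer> store = new ConcurrentHashMap<>();
        ExecutorService writer = Executors.newSingleThreadExecutor();

        for (int i = 0; i < count; i++) {
            final int n = i;
            writer.execute(() -> {
                try {
                    Thread.sleep(5); // pretend each write takes a little while
                } catch (InterruptedException e) {
                    return; // interrupted by shutdownNow(): this write is lost
                }
                store.put(String.valueOf(n), n);
            });
        }

        try {
            if (graceful) {
                writer.shutdown(); // accept no new work, but finish what is queued
            } else {
                writer.shutdownNow(); // discard everything still waiting in the queue
            }
            writer.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }

        int missing = 0;
        for (int i = 0; i < count; i++) {
            if (!store.containsKey(String.valueOf(i))) {
                missing++;
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        System.out.println("abrupt shutdown, missing keys:   " + missingAfterShutdown(100, false));
        System.out.println("graceful shutdown, missing keys: " + missingAfterShutdown(100, true));
    }
}
```

The abrupt run loses most of its writes, which is the same symptom as the NULLs we saw when reading back; the graceful run loses none.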

java -cp classes/:lib/memcached-2.5.jar tutorial.MembaseTest3

Output:


Cache get : 998 : 998

Cache get : 999 : 999

Time for 1000 gets is 500 ms.  nulls 0

No more NULLs. We get back all the values we wrote.

So now we have successfully simulated multiple clients. One thing
to remember: the client caches operations, so shut it down gracefully
to make sure no data is lost.

Accessing Different Buckets

So far we were accessing the default ‘data bucket’. In
databases we have different ‘tables’. In Membase we have
different data-buckets. Each data-bucket is bound to a unique
port.

Access the Membase admin UI at http://localhost:8091,
click on Manage -> Data Buckets,
and create another bucket. Bind it to port 11212 (one above the standard port).


Now point the memcached client to a different port to access
a different bucket:
port = 11212

That’s it!
We can have multiple Membase connections for multiple data-buckets in the same program.
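
Since each bucket is just a different port on the same host, the only thing that changes per connection is the socket address. A small sketch of keeping one address per bucket follows. BucketAddressSketch and the bucket name ‘second’ are hypothetical (pick whatever name you gave your bucket in the admin UI); the ports are the ones used in this tutorial.

```java
import java.net.InetSocketAddress;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that maps bucket names to the host/port they are bound to. */
final class BucketAddressSketch {
    /** Builds one socket address per bucket, all on the same host. */
    static Map<String, InetSocketAddress> bucketAddresses(String host, Map<String, Integer> bucketPorts) {
        Map<String, InetSocketAddress> addresses = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : bucketPorts.entrySet()) {
            addresses.put(entry.getKey(), new InetSocketAddress(host, entry.getValue()));
        }
        return addresses;
    }

    public static void main(String[] args) {
        Map<String, Integer> bucketPorts = new LinkedHashMap<>();
        bucketPorts.put("default", 11211); // default bucket on the standard memcached port
        bucketPorts.put("second", 11212);  // hypothetical second bucket, one port above

        for (Map.Entry<String, InetSocketAddress> entry
                : bucketAddresses("localhost", bucketPorts).entrySet()) {
            System.out.println(entry.getKey() + " -> port " + entry.getValue().getPort());
        }
    }
}
```

Each address would then be handed to its own client connection, one connection per bucket.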

Update @ Dec 06, 2010:

According to Matt Ingenthron (Software Engineer – Membase, Inc), a pre-release version of SpyMemcached can connect to different buckets
without the need to explicitly configure port numbers.
See here : http://wiki.membase.org/display/membase/prerelease+spymemcached+vBucket

Sujee Maniyam
Sujee is a founder and principal at Elephant Scale, where he provides consulting and training on Big Data technologies.

3 Comments:


  • By Shreyas 09 May 2011

    In the single client scenario, using the same MemcachedClient for inserting and fetching will lead to misleading times when the records cross 10-15K. Closing and flushing the client object and reopening a new one, after every fetch will result in true timings

  • By Matthew 07 Jul 2011

    Hi ,

    Very neat tutorial! But when I tried running the code, I felt that the data didn’t get committed to disk; rather, it was memcached that was getting used, even though I specified the Membase port in the MemcachedClient instance. Please let me know how to confirm whether Membase was actually used (data persistence).

  • By admin 08 Jul 2011

    @Matthew
    Look at the Membase dashboard; it will show items in memory / disk, etc.

    You can also shut down Membase, bring it back up, and try to read the values you set previously.




Copyright 2015 Sujee Maniyam (