30 Nov / 2010
Membase Java tutorial
Membase is a NoSQL database. It is designed to be a persistent
storage layer behind Memcached, a popular in-memory caching
tool. Membase is relatively new but has found solid
footing in the high-performance NoSQL world.
This is a quick tutorial on using the Membase database from Java.
Setting up Membase
Follow the
instructions from Membase.org
Client
Interacting with Membase is like interacting with Memcached. We
will be using the SpyMemcached Java client, so download it from here.
Code
All the code needed to run this project is available at GitHub : https://github.com/sujee/membase-tutorial
It is an Eclipse project and is ready to go.
Let's start
So here is the Java code. It writes a bunch of key-value pairs into
Membase and reads them back.
You can run this file (MembaseTest1) from Eclipse. To run from the command line:
sh compile.sh
sh run.sh
or
java -cp classes/:lib/memcached-2.5.jar tutorial.MembaseTest1
Membase is running on localhost on port 11211 (default memcached port)
We start by ‘flushing’ the data-bucket, so we have a clean start.
Set
cache.set (string_key, expiration_time, object_value)
Our keys are stringified numbers, and our values are Integer objects.
Here is a sample output:
2010-11-29 23:36:33.234 INFO net.spy.memcached.MemcachedConnection: Added {QA sa=localhost/127.0.0.1:11211, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2010-11-29 23:36:33.240 INFO net.spy.memcached.MemcachedConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@b34bed0
cache put : 0 : 0, result net.spy.memcached.internal.OperationFuture@578088c0
cache put : 1 : 1, result net.spy.memcached.internal.OperationFuture@37922221
cache put : 2 : 2, result net.spy.memcached.internal.OperationFuture@5afec107
cache put : 3 : 3, result net.spy.memcached.internal.OperationFuture@b32e13d
cache put : 4 : 4, result net.spy.memcached.internal.OperationFuture@39617189
cache put : 5 : 5, result net.spy.memcached.internal.OperationFuture@2c64f6cd
…
…
The SpyMemcached client queues operations and sends them asynchronously to increase performance. So, as you can see, cache.set returns an ‘OperationFuture’ object rather than a final result. The key-value pair will eventually be persisted in Membase.
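As a minimal sketch of the write loop described above (not the author's exact code), the snippet below hides the client behind a small interface with the same shape as cache.set, so it runs without a Membase server. The names SetLoopSketch, Setter and writeAll are made up for illustration, and the expiration value of 0 (no expiry) is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;

class SetLoopSketch {
    /** Same shape as cache.set(string_key, expiration_time, object_value). */
    interface Setter {
        Future<Boolean> set(String key, int expirationSeconds, Object value);
    }

    /** Writes keys "0".."count-1" with Integer values and returns the pending results. */
    static List<Future<Boolean>> writeAll(int count, Setter cache) {
        List<Future<Boolean>> pending = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            // Key is the stringified number, value is the Integer itself.
            // Expiration 0 (assumed here) means the item never expires.
            pending.add(cache.set(String.valueOf(i), 0, Integer.valueOf(i)));
        }
        return pending;
    }

    public static void main(String[] args) throws Exception {
        // In-memory stand-in for the data bucket so the sketch runs anywhere.
        Map<String, Object> fakeBucket = new LinkedHashMap<>();
        Setter fake = (key, exp, value) -> {
            fakeBucket.put(key, value);
            return CompletableFuture.completedFuture(Boolean.TRUE);
        };

        for (Future<Boolean> f : writeAll(5, fake)) {
            f.get(); // block until this particular write has completed
        }
        System.out.println(fakeBucket);

        // With SpyMemcached on the classpath, the real wiring would be roughly:
        //   MemcachedClient cache =
        //       new MemcachedClient(new InetSocketAddress("localhost", 11211));
        //   cache.flush();               // start from an empty bucket
        //   writeAll(100, cache::set);
    }
}
```

Calling get() on a returned future blocks until that write has completed, which is one way to confirm a write before moving on.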
Get
Next we try to read back the same values we just wrote. We are using the same client. The output looks like this:
Cache get : 96 : 96
Cache get : 97 : 97
Cache get : 98 : 98
Cache get : 99 : 99
Time for 100 gets is 77 ms. nulls 0
We are keeping an eye out for NULLs. We shouldn’t get any at this time, and as expected our null count is zero.
The code is also profiled with timestamps so we can see how fast the operations are.
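The read-back loop with its null count and timing could look roughly like the sketch below. It is an illustration under assumptions, not the code from the repository: GetLoopSketch and countNulls are hypothetical names, and a HashMap stands in for the server so the example runs on its own.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class GetLoopSketch {
    /** Reads keys "0".."count-1" through the getter and counts how many come back null. */
    static int countNulls(int count, Function<String, Object> getter) {
        int nulls = 0;
        for (int i = 0; i < count; i++) {
            Object value = getter.apply(String.valueOf(i));
            if (value == null) {
                nulls++;
            }
        }
        return nulls;
    }

    public static void main(String[] args) {
        // In-memory stand-in for the data bucket, pre-filled with 100 items.
        Map<String, Object> fakeBucket = new HashMap<>();
        for (int i = 0; i < 100; i++) {
            fakeBucket.put(String.valueOf(i), i);
        }

        long start = System.currentTimeMillis();
        int nulls = countNulls(100, fakeBucket::get);
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("Time for 100 gets is " + elapsed + " ms. nulls " + nulls);

        // With SpyMemcached, the getter would be the client itself, e.g.
        //   countNulls(100, key -> cache.get(key));
    }
}
```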
Simulating Multiple Clients
For the above example, we wrote and read from the same client. In real scenarios, however, multiple clients will read from and write to Membase. Let's simulate that by using two different client connections.
So in this version we use one connection for writing and another connection for reading.
java -cp classes/:lib/memcached-2.5.jar tutorial.MembaseTest2
Cache get : 0 : 0
Cache get : 1 : 1
…….
…..
Cache get : 997 : null
Cache get : 998 : null
Cache get : 999 : null
Time for 1000 gets is 540 ms. nulls 42
One thing that stands out is that we are seeing NULLs when we
try to read back the values we just set!
So what is going on here?
Remember, the SpyMemcached client queues operations to increase
performance. When we call ‘shutdown’, the client exits without
writing all the queued values into Membase. So these
key-value pairs are simply lost! Not good!!
Let's fix this.
After we are done with the SETs, let's shut down the client gracefully.
Here we are giving it 10 seconds to shut down. Hopefully this
will give the client a chance to write out all queued operations.
java -cp classes/:lib/memcached-2.5.jar tutorial.MembaseTest3
output:
…
…
Cache get : 998 : 998
Cache get : 999 : 999
Time for 1000 gets is 500 ms. nulls 0
No more NULLs. We get back all the values we wrote in.
So now we have successfully simulated multiple clients. One thing
to remember is that the client queues operations, so shut it down gracefully
to make sure no data is lost.
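To make the difference between the two shutdown styles concrete, here is an analogy built only from the JDK. It does not use SpyMemcached and does not claim to mirror its internals: a single-threaded executor plays the role of the client's pending-operation queue, and a map plays the role of Membase. ShutdownSketch and run are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ShutdownSketch {
    /**
     * Queues `count` slow writes on a background thread, then shuts the
     * writer down abruptly or gracefully. Returns how many writes landed.
     */
    static int run(int count, boolean graceful) {
        Map<String, Integer> store = new ConcurrentHashMap<>();       // stand-in for Membase
        ExecutorService writer = Executors.newSingleThreadExecutor(); // stand-in for the client's queue
        for (int i = 0; i < count; i++) {
            final int n = i;
            writer.execute(() -> {
                try {
                    Thread.sleep(2); // pretend each write takes a moment
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return; // interrupted: this write is dropped
                }
                store.put(String.valueOf(n), n);
            });
        }
        if (graceful) {
            writer.shutdown(); // accept no new work, but keep draining the queue
            try {
                writer.awaitTermination(10, TimeUnit.SECONDS); // wait up to 10 seconds
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        } else {
            writer.shutdownNow(); // discard whatever is still queued
        }
        return store.size();
    }

    public static void main(String[] args) {
        System.out.println("abrupt shutdown kept   " + run(200, false) + " of 200 writes");
        System.out.println("graceful shutdown kept " + run(200, true) + " of 200 writes");
    }
}
```

In SpyMemcached the corresponding calls are shutdown() and shutdown(10, TimeUnit.SECONDS). That the tutorial's MembaseTest2 and MembaseTest3 differ in exactly this way is an assumption based on the description above.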
Accessing Different Buckets
So far we were accessing the default ‘data bucket’. In
databases we have different ‘tables’. In Membase we have
different data-buckets. Each data-bucket is bound to a unique
port.
Access the Membase admin UI at http://localhost:8091
Click on Manage -> Data Buckets
and create another bucket. Bind it to port 11212 (one above the standard port).
Now point the memcached client to a different port to access
a different bucket.
port = 11212
That’s it!
We can have multiple Membase connections for multiple data-buckets in the same program.
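One way to keep track of several buckets in one program is a map from bucket name to address, with one client per address. The sketch below only builds the addresses, so it runs without SpyMemcached or a server. BucketAddressSketch, addressesFor and the bucket name "second" are hypothetical; only the ports 11211 and 11212 come from the text above.

```java
import java.net.InetSocketAddress;
import java.util.LinkedHashMap;
import java.util.Map;

class BucketAddressSketch {
    /** Maps each bucket name to the host:port a client would connect to. */
    static Map<String, InetSocketAddress> addressesFor(String host, Map<String, Integer> bucketPorts) {
        Map<String, InetSocketAddress> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : bucketPorts.entrySet()) {
            result.put(e.getKey(), new InetSocketAddress(host, e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> ports = new LinkedHashMap<>();
        ports.put("default", 11211); // default data bucket
        ports.put("second", 11212);  // hypothetical name for the bucket created above

        Map<String, InetSocketAddress> addresses = addressesFor("localhost", ports);
        addresses.forEach((bucket, addr) ->
            System.out.println(bucket + " -> " + addr.getHostString() + ":" + addr.getPort()));

        // With SpyMemcached, each address would get its own client, e.g.
        //   MemcachedClient second = new MemcachedClient(addresses.get("second"));
    }
}
```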
Update @ Dec 06, 2010:
According to Matt Ingenthron (Software Engineer – Membase, Inc), a pre-release version of SpyMemcached can connect to different buckets
without the need to explicitly configure port numbers.
See here : http://wiki.membase.org/display/membase/prerelease+spymemcached+vBucket

3 Comments:
By Shreyas 09 May 2011
In the single client scenario, using the same MemcachedClient for inserting and fetching will lead to misleading times when the records cross 10-15K. Closing and flushing the client object and reopening a new one, after every fetch will result in true timings
By Matthew 07 Jul 2011
Hi ,
Very neat tutorial! but when I tried running the code, I felt that the data dint get committed to the disk, rather it was the memcached that was getting used though I specified the Membase port in the memcachedClient instance. Please let me know how to confirm whether Membase was actually used (data persistence).
By admin 08 Jul 2011
@Matthew
look at Membase dashboard, it will show items in memory / disk …etc.
Also you can shutdown membase, bring it back up and try to read the values you set previously.