DevOps Zone is brought to you in partnership with:

Istvan Szegedi is an IT Technical Architect at Vodafone UK. He has been working at Hewlett-Packard, Nokia Networks, Google, Morgan Stanley and Vodafone. He holds certificates such as Sun Certified System Administrator, Sun Certified Java Programmer, Sun Certified Web Component Developer, Salesforce.com Certified Force.com Developer, TOGAF Certified Enterprise Architect. As a big fan of mobile and cloud computing, he likes to believe that these technologies will eventually push aside the desktop/client-server architecture Istvan is a DZone MVB and is not an employee of DZone and has posted 38 posts at DZone. You can read more from them at their website. View Full User Profile

NuoDB 2.0: A Distributed Database

01.21.2014
| 7608 views |
  • submit to reddit

Introduction

NuoDB 2.0 was recently released and it delivers some major new geo-distributed management features, along with database automation and SQL and Java stored procedures. The geo-distribution capabilities enable users to distribute a single logical database across multiple cloud regions or data centers. The database automation functionality offers easy provisioning for standard configurations based on templates, while stored procedures bring the capability to put more complex logic into the database layer, in a similar way as it is done in traditional RDBMS systems such as Oracle or MySQL.

NuoDB Geo-Distribution

The new version of NuoDB provides support for geographically distributed databases, each of which can be a single, ACID-compliant logical database running across multiple data centers located in separate geographic regions. This ability makes it possible to achieve 24/7 application availability, and support applications that require low-latency access for all their users across different geographic regions.  One example would be a financial institution with offices around the world or, as NuoDB referenced during the 2.0 launch event, a VoIP provider with users all over the world.

In order to minimize latency, the local client is going to talk to the local transaction engine geographically close to the client. The on-demand caching mechanism is then going to ensure that the clients within the same physical region are accessing the data with low latency; there will be no need for coordination between different regions. For commits, NuoDB supports different isolation and durability modes that ensure that the users always see consistent durable data.


In NuoDB 2.0 the concept of regions was introduced. Regions can represent a geographic location of the database, for instance London, Tokyo or New York in our example above. In order to provision a region we need to define it for the NuoDB Brokers and Agents in the default.properties file using the region attribute.  Later on in this article we will show a configuration for regions.

The challenge with the geographically distributed setup is how to ensure data consistency across the regions.  As mentioned above, NuoDB can support different kinds of commit protocols.  In its simplest form (called local commit in NuoDB terms) the Transaction Engine (TE) sends the commit message to the Storage Manager (SM) asynchronously and then it sends back an acknowledgment to the client. This is the fastest operation but the least safe; it does not ensure durability. 

Figure-1: Local commit protocol

The most reliable commit protocol is called Region:N commit. If you want to guarantee data durability across the regions then you should set up your TEs and SMs using Region:N with Journalling configuration (by using the -- commit  region:n  and -- journal enable flags in nuodb.config). The drawback of this type of the commit protocols is that this is the most expensive commit protocol.


Figure-2: Region:N with journaling commit protocol

NuoDB Automation

As with any distributed system or other database such as , MySQL, SQL Server, PostgreSQL or Oracle to provision a standard NuoDB database configuration can be laborious; with NuoDB, the system/database administrator needs to start an agent on all the machines that are to be provisioned to run NuoDB, then he needs to launch a management console (either command line or web-based), where he can initiate the required storage manager and transaction engine processes. In case of a basic, high-availability configuration it requires at least two agents, two storage managers and two transaction engines.

In NuoDB 2.0 there is an automation functionality that makes NuoDB an excellent choice to fulfill easy-to-deploy distributed database requirements.  Besides helping the initial deployments, automation is also a great choice to support scale-out operations or to enforce minimal process requirements.  Automation is host- and region-aware such that databases can expand automatically when a host joins an existing region in a NuoDB Domain, or when a new region joins the domain. The solution is based on the REST API and a predefined set of templates. NuoDB 2.0/2.0.1/2.0.2 installs the REST service and the automation console but disables it by default. 

In our example we will use the gzipped tar installation package to install NuoDB 2.0.1.  (Of course, you can use 2.0.2 in case you have already downloaded and installed the latest version of the product.)  We also need to create the appropriate production-archive directory (under /var/opt/nuodb in this case).

In order to activate automation, we need to run a script called run-auto-console.sh with –bootstrap flag. Note that this needs to run only once per domain and on only one of the Broker hosts.

# Untar nuodb 2.0.1 and create a symbolic link called nuodb
$ cd ~
$ tar xvzf nuodb-2.0.1.linux.x64.tar.gz
$ ln -s nuodb-2.0.1.219.linux.x86_64/ nuodb

# Create the production archive directory under /var/opt/nuodb
$ sudo mkdir /var/opt/nuodb
$ sudo chmod nuodb:nuodb /var/opt/nuodb
$ mkdir /var/opt/nuodb/production-archives

$ ./run-auto-console.sh –bootstrap

As a result, the script will edit the Broker default.properties file to contain the required automation properties (enableAutomation and enableAutomationBootstrap), restart the NuoDB Broker and start up the Automation Console:

$ tail -6 default.properties
# to TEs in any region in a round-robin pattern.
#balancer =

# Updates made by run-auto-console for supporting automatic administration
enableAutomation = true
enableAutomationBootstrap = true

NuoDB Agents and Brokers internally use an admin database (“nuodb_system”) to coordinate enforcement of database requirements based on domain events (such as host joining a domain) as well as storing database-processing metrics. This database is created as part of the bootstrap operation. The database cannot yet be managed via the Automation Console, but can be seen from the Web admin console (http://localhost:8080/console.html) or from the administrative command line interface.  

By default, NuoDB Automation Console is accessible via port 8888, so we need to fire up a browser with http://<hostname>:8888  that displays the automation console:


Once we have logged in (the default username is domain, password is bird), we can then navigate to Templates menu.


We can see four options in NuoDB 2.0.1: Single Host, Minimally Redundant, Multi Host and Geo-distributed.

The definition of the templates is as follows:

  • Single Host: This template starts one Storage Manager and one Transaction Engine on a host that you specify. These processes will be fixed to that host and will not be moved if that host goes offline.
  • Minimally Redundant: This template starts two Storage Managers and two Transaction Engines. Each host will have at most one SM and one TE, so this template requires a minimum of two hosts, but may use up to four. A process will be restarted on a new host if one goes offline and another is available.
  • Multi-Host: This template starts up to two Storage Managers and as many Transaction Engines as possible. Each host will have at most one SM and one TE, so this template can be used with any number of hosts. A process will be restarted on a new host if one goes offline and another is available.
  • Geo-distributed: This template starts one Storage Manager and as many Transaction Engines as possible in every region. Each host will have at most one SM and one TE, so this template can be used with any number of hosts. A process will be restarted on a new host if one goes offline and another is available.

After clicking on the Launch DB Instance button, a configuration menu will be shown. We can then define the database name, the DBA username and password and select one of the available templates. In this particular example we use Single Host. 


The Single Host template will define a NuoDB server (storage manager and transaction engine) on one machine (in this particular example it is localhost.localdomain with DEFAULT region running  on a single laptop).

Once we submit the request, the REST API will interpret it and the defined configuration will be launched.  After a couple of seconds our testdb will appear with RUNNING status.

Basic management functionalities such as delete, stop, and edit are available via the UI. By clicking on the database name (testdb in this example), we will see that one storage manager and one transaction engine have been started on the defined host:

Scaling out with NuoDB Automation Console

We can also use the automation console to support scale-out functionality. The example below will demonstrate how to set up a minimally redundant solution using AWS EC2 nodes. Prior to using the automation console, we need to install NuoDB on the two machines. We then need to configure the peers in default.properties file to point to the other Broker.

On SERVER ip-10-229-57-25:
#edit default.properties
$ vi default.properties
broker=true
peer=10.72.165.209

$ bin/run-auto-console --bootstrap
# Updates made by run-auto-console for supporting automatic administration
enableAutomation = true
enableAutomationBootstrap = true


On SERVER ip-10-72-165-209:
#edit default.properties
$ vi default.properties
broker=true
peer=10.229.57.25

$ sudo etc/nuoagent start

After that we need to run the script run-auto-console.sh with –bootstrap on one and on only one machine in the domain to create nuodb_system database. Then we can login to the automation console using the appropriate AWS URL and credentials (http://EC2-HOSTNAME.compute.amazonaws.com:8888/, default username and password are domain and bird).

Now we can start up our Single Host node on an AWS EC2 machine using the same Single Host template as before. This time the host machine is ip-10-229-57-25 with DEFAULT region.

Once our testdb is up and running, we can click on the Edit Database menu and choose the Minimally Redundant template as shown below:

 

After clicking on the Submit button the server will start running another Transaction Engine (TE) and Storage Manager (SM) on host ip-10-229-57-25, thus we will have 1 region, 2 hosts and 4 processes running.

 

The host configuration for our database will be as follows:

Geo-distributed configuration with NuoDB Automation Console

The NuoDB 2.0.x Automation Console can also be used to configure geo-distributed databases. The steps in the example below are similar as in the previous example; we need to install NuoDB on the required servers then configure the regions in the default.properties file for the Broker and Agents. For Brokers we also need to define the peers from other regions, in a way similar to the minimal redundancy configuration:

On SERVER ip-10-229-57-25:
#edit default.properties
$ vi default.properties
broker=true
peer=10.72.165.209
region=REGION_A

$ bin/run-auto-console --bootstrap
# Updates made by run-auto-console for supporting automatic administration
enableAutomation = true
enableAutomationBootstrap = true


On SERVER ip-10-72-165-209:
#edit default.properties
$ vi default.properties
Broker=true
peer=10.229.57.25
region=REGION_B

$ sudo etc/nuoagent start

Now we can launch a database with the Geo-distributed template:


As a result we will have two separate regions (REGION_A and REGION_B) running on one host each:

As we could see, NuoDB automation offers an easy-to-use database-provisioning interface for  administrators; they can use a web console to start up different database configurations with the help of just a few clicks. The provided templates can also be modified to tailor them to the requirements.

NuoDB Stored Procedures

The other great feature in the newest NuoDB release is the support for stored procedures. Traditionally, stored procedures in the RDBMS world offer a way to support complex logic within the database and thus avoid costly communications between the application layer and the database engine that could otherwise require a vast amount of data to be transferred between them in order to support similar functionalities. The closer we put the logic to the database, potentially the more performant application we could build.

NuoDB syntax for creating stored procedures is similar to the traditional SQL syntax. Below is an example. First we create a Stock table:

$ cd ~/nuodb
$ bin/nuosql testdb –user testdba –password testdba

SQL> 
   create table stock (
      id             Integer not NULL generated always as identity primary key,
      stocksymbol    String,
      stocksate      Date,
      stockopen      Decimal(8,2),
      stockhigh      Decimal(8,2),
      stocklow       Decimal(8,2),
      stockvlose     Decimal(8,2),
      stockvolume    Integer,
      stockadjclose  Decimal(8,2)
   );

Then we can load a stock data file into this Stock table using nuoloader:

$ bin/nuoloader --schema user --user testdba --password testdba --import "/home/ubuntu/google.csv",skip --to "insert into stock values(default,?,?,?,?,?,?,?,?)" testdb

Now we can define a stored procedure that will return the minimum, maximum and average stock volumes within a given period of time (defined by start date and end date parameters):

$ bin/nuosql db –user testdba –password testdba

SQL> 
SQL> set delimiter |
SQL> create procedure stockvols(IN symbol string, IN startdate date, IN enddate date, OUT minvol INTEGER, OUT maxvol INTEGER, OUT avgvol INTEGER)
  as
     minvol, maxvol, avgvol = 
         (select min(stockvolume), max(stockvolume), avg(stockvolume) 
         from stock 
         where stocksymbol = symbol and 
                          stockdate >= startdate and 
                          stockdate <= enddate);
  end_procedure
|
set delimiter ;

In order to validate our stored procedure we can define a few test procedures:

SQL> set delimiter |
SQL> create procedure test()
as
   var x INTEGER, y INTEGER, z INTEGER;
   call stockvols('GOOG', date('2010-10-11'), date('2010-11-11'), x, y, z);
   throw x;
end_procedure


SQL> create procedure test2()
as
   var x INTEGER, y INTEGER, z INTEGER;
   call stockvols('GOOG', date('2010-10-11'), date('2010-11-11'), x, y, z);
   throw y;
end_procedure
|
set delimiter ;


# TEST SQL SELECT

SQL> select min(stockvolume), max(stockvolume), avg(stockvolume) from stock where stocksymbol = 'GOOG'                                                       
and stockdate >= date('2010-10-11') and stockdate <= date('2010-11-11');

   MIN     MAX     AVG
 ------- ------- -------

 1704700 7212900 3027310

# TEST RESULT
SQL> call test();
1047000


SQL> call test2();
7212900

NuoDB Stored Procedures invoked from a Java application

If we want to execute the stored procedure that we just defined above from a Java application then we can use the following code example:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.CallableStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Properties;
import java.text.SimpleDateFormat;
import java.text.ParseException;
import java.util.Date;


public class NuoDBStoredProc {
      public static void main(String args[]) {
     
           ExecStoredProc storedProc = new ExecStoredProc();
           try {
                Connection conn = storedProc.connect("testdba", "testdba", "testdb");
                // prepare the EXECUTE statement
                CallableStatement stmt = 
                    conn.prepareCall("EXECUTE STOCKVOLS(?, ?, ?, ?, ?, ?)");
              
               storedProc.exec(stmt);
               if (conn != null ) conn.close();
           }
           catch (Exception e)  {
                System.out.println("Exception: " + e.getMessage());
           }
      }
}


class ExecStoredProc {
      /** The driver class provided by NuoDB. */
      public static final String DRIVER_CLASS =
           "com.nuodb.jdbc.Driver";

      /** The base URL for connecting to a local database server. */
      public static final String DATABASE_URL =
           "jdbc:com.nuodb://localhost/";

      public Connection connect(String user, String password, String dbName) throws SQLException, ClassNotFoundException {
            //Class.forName(DRIVER_CLASS);
            Class.forName("com.nuodb.jdbc.Driver");
 
            Properties properties = new Properties();
            properties.put("schema", "USER");
            properties.put("user", user);
            properties.put("password", password);
            
            Connection conn = 
                DriverManager.getConnection(DATABASE_URL + dbName, properties);
            return  conn;
      }

      public void exec(CallableStatement stmt)  throws SQLException, ParseException {
           // set a value for the first IN parameter
           stmt.setString(1, "GOOG");
           // set a value for the second IN parameter
           SimpleDateFormat formatter = new SimpleDateFormat("yyyy-MM-dd");
           Date parsedDate = (Date) formatter.parse("2010-10-11");
           java.sql.Date sqldate = new java.sql.Date(parsedDate.getTime());
           stmt.setDate(2, sqldate);
           // set a value for the third IN parameter
           parsedDate = (Date) formatter.parse("2010-11-11");
           sqldate =  new java.sql.Date(parsedDate.getTime());
           stmt.setDate(3, sqldate);
  
 
           // register the OUT parameter to retrieve min volume
           stmt.registerOutParameter(4, java.sql.Types.INTEGER);
           // register the OUT parameter to retrieve max volume
           stmt.registerOutParameter(5, java.sql.Types.INTEGER);
           // register the OUT parameter to retrieve  average volume
           stmt.registerOutParameter(6, java.sql.Types.INTEGER);

           ResultSet resultSet = stmt.executeQuery();

           int minvol = stmt.getInt(4);
           int maxvol = stmt.getInt(5);
           int avgvol = stmt.getInt(6);

           System.out.println("minvol: " + minvol);
           System.out.println("maxvol: " + maxvol);
           System.out.println("avgvol: " + avgvol);
           stmt.close();
      }

}

The notion is that we need to define a CallableStatement for NuoDB that will execute the required stored procedures. We can specify IN, INOUT or OUT parameters or we can also return a ResultSet. The database connection is based on NuoDB’s JDBC driver provided in the standard NuoDB installation package.

We can then just create a very basic build script to compile and run the application:

$ cat build_script.sh
#!/bin/sh
javac NuoDBStoredProc.java
java -classpath .:../jar/nuodbjdbc.jar NuoDBStoredProc

$./build-run.sh
minvol: 1704700
maxvol: 7212900
avgvol: 3027310

NuoDB Java Stored Procedures

NuoDB 2.0.1 also supports external stored procedures written in Java. (Note that the latest release, NuoDB 2.0.2, also adds .Net Support). Stored procedures written in SQL can still provide the highest performance but Java stored procedures can offer a rich set of capabilities including using 3rd party libraries. The previous SQL stored procedure example can be rewritten as follows using external Java stored procedures:

$ vi NuoJavaProc.java
import java.sql.*;

class NuoJavaProc {

    static void  stockvols(Connection conn, String symbol, Date startdate, Date enddate, int[] minvol, int[] maxvol, int[] avgvol) throws SQLException
    {
         String query = "select min(stockvolume), max(stockvolume), avg(stockvolume) from stock where stocksymbol = ? and stockdate >= ? and stockdate <= ?";

         PreparedStatement stockvolume = conn.prepareStatement(query);
         stockvolume.setString(1, symbol);
         stockvolume.setDate(2, startdate);
         stockvolume.setDate(3, enddate);
         ResultSet resultSet = stockvolume.executeQuery();
         if ( resultSet.next() ) {
              minvol[0] = resultSet.getInt(1); 
              maxvol[0] = resultSet.getInt(2); 
              avgvol[0] = resultSet.getInt(3); 
         }
    } 
}

Then we need to compile our code and create a .jar file as follows:

$ javac NuoJavaProc.java 
$ jar cf nuodb-jext.jar NuoJavaProc.class

In order to deploy our Java stored procedure we need to execute the following statements:

$ bin/nuosql testdb --user testdba --password testdba
SQL> create javaclass jext from 'nuodb-jext.jar';
SQL> create procedure stock_vols(IN symbol STRING, IN startdate DATE, IN enddate DATE, OUT minvol INTEGER, OUT maxvol INTEGER, OUT avgvol INTEGER) language java external 'jext:NuoJavaProc.stockvols';
SQL> show procedures;
 
    Procedures in schema USER
        STOCKVOLS
        STOCK_VOLS
        TEST
        TEST2

Now we are ready to test our external Java stored procedure:

SQL> create procedure test3()
   > as
   > var x INTEGER, y INTEGER, z INTEGER;
   > call stock_vols('GOOG', date('2010-10-11'), date('2010-11-11'), x, y, z);
   > throw x;
   > end_procedure
   > |

SQL> call test3();
1765700

Conclusion

The new version of NuoDB is moving to a very exciting direction where this distributed SQL database solution can make a real difference. Geo-distribution can offer higher availability for businesses by supporting databases running across multiple regional data centers and can offer applications better performance by serving database queries from a geographically close data center. The automation functionality offers an easy, cloud-friendly provisioning interface to automate database creation based on templates, while NuoDB SQL and Java stored procedures provide an excellent choice for developers to implement their logic within the database and avoid unnecessary data transfer between the application layer and the database engine.



Published at DZone with permission of Istvan Szegedi, author and DZone MVB.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Tags: