Thursday, August 22, 2013

All you need is Log, All you need is Log, ..., Log is all you need.

Paraphrasing the Beatles' song, I am starting this post by saying that logging is one of the most important features that developers, sysadmins, and DBAs need in order to successfully develop, deploy, test, and tune Java applications inside the RDBMS.
Inside the OJVM this feature is even more important because there is no visual feedback when you are running Java code.
Starting with 10g I have been using this feature as explained in my post Getting logging entering, exiting and throwing work at Oracle JVM, which according to Google Analytics is one of my most visited pages, but most of those tweaks require uploading new classes and configuring them with a certain amount of knowledge.
Starting with 12c, the logging feature is directly integrated into the OJVM, also for PDBs.
This means that by simply loading a logging.properties file into a user's schema you can control the logging facilities for Java code running with that effective user (the Oracle JVM performs specific steps to configure logging options).
For example, if I have a class loaded into SCOTT's schema and this class runs with the effective user SCOTT, to activate the logging level for that class and user you have to execute:
loadjava -u scott/tiger@mydb -v javavm/lib/logging.properties
where the logging.properties file includes something like:
$ cat javavm/lib/logging.properties
# Limit the message that are printed on the console to INFO and above.
java.util.logging.ConsoleHandler.level = ALL
java.util.logging.ConsoleHandler.formatter = java.util.logging.SimpleFormatter
# Scotas OLS
com.scotas.level = INFO
org.apache.level = INFO
Remember that to use logging inside the OJVM the SCOTT user should have SYS:java.util.logging.LoggingPermission = control; to grant that, connect as SYS and execute:
$ sqlplus "sys/pwd@mydb as sysdba"
SQL> exec dbms_java.grant_permission( 'SCOTT', 'SYS:java.util.logging.LoggingPermission', 'control', '');
SQL> commit;
Your application can also control logging level properties at run time; the configuration above is static and is applied when the OJVM starts executing Java code for a specific DB session.
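For instance, here is a minimal sketch of such runtime control from Java running inside the OJVM (the logger name is just the com.scotas hierarchy used in the logging.properties above, and calling setLevel requires the java.util.logging.LoggingPermission 'control' granted earlier):

import java.util.logging.Level;
import java.util.logging.Logger;

public class LogControl {
    // Change the level of a logger hierarchy for the current DB session
    public static void setLevel(String loggerName, String level) {
        Logger.getLogger(loggerName).setLevel(Level.parse(level)); // e.g. "FINE"
    }
}

Exposed through a call spec, a procedure like this could be invoked from SQL*Plus to raise or lower a hierarchy's level without reloading logging.properties.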
The example below is the Scotas OLS web page available for controlling logging properties at run time (the old logging configuration Servlet of Apache Solr).
As shown in the screenshot, the logging level for each hierarchy starts with a default setting inherited from the logging.properties file, but we can change it to any other value using the form above.
Finally, where do my logs go?
Regardless of whether you are running your Java application in a traditional RDBMS installation or in a new CDB/PDB configuration, log information will go to the .trc file associated with the session, located at:
$ORACLE_BASE/diag/rdbms/$ORACLE_SID/$ORACLE_SID/trace
The .trc file is named using the convention $ORACLE_SID_jnnn_$PID.trc, for example:
$ cat orclc_j020_11761.trc
Trace file /u01/app/oracle/diag/rdbms/orclc/orclc/trace/orclc_j020_11761.trc
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/12.1.0.1.0/dbhome_1
System name: Linux
Node name: localhost
Release: 3.2.0-52-generic
Version: #78-Ubuntu SMP Fri Jul 26 16:21:44 UTC 2013
Machine: x86_64
Instance name: orclc
Redo thread mounted by this instance: 1
Oracle process number: 7
Unix process pid: 11761, image: oracle@pocho (J020)

*** 2013-08-22 11:29:41.805
*** SESSION ID:(387.581) 2013-08-22 11:29:41.805
*** CLIENT ID:() 2013-08-22 11:29:41.805
*** SERVICE NAME:(SYS$USERS) 2013-08-22 11:29:41.805
*** MODULE NAME:(DBMS_SCHEDULER) 2013-08-22 11:29:41.805
*** ACTION NAME:(SOLRSERVLETSERV#1) 2013-08-22 11:29:41.805
*** CONTAINER ID:(3) 2013-08-22 11:29:41.805
.....
*** 2013-08-22 11:52:37.581
INFO: Unset log level on 'root'.
Aug 22, 2013 11:52:37 AM com.scotas.solr.servlet.LogLevelSelection doPost
INFO: Set 'com.scotas' to INFO level.
It is important to note that even if you have multiple PDBs, all log files will go to the trace directory of the CDB because all PDBs re-use the process infrastructure of the CDB.
I will post another blog entry about the impact of the Multitenant Environment's process infrastructure on OJVM applications.

Wednesday, August 21, 2013

An efficient way to do massive inserts with Oracle JDBC

  I was exchanging emails with my friend Kuassi Mensah (@kmensah) to see how to use, from JDBC, what is known as array DML.
  This is typically known to DBAs in the form:
INSERT INTO TEST_SOURCE_BIG_A
        SELECT a.line a_line, a.name a_name, a.owner a_owner, a.text a_text, a.type a_type
           FROM TEST_SOURCE_BIG a;
  If the two tables have the same structure, the RDBMS realizes that and puts together batches of between 140 and 170 rows each.
  As a result, any associated index (such as Scotas OLS) receives a call to the function:
FUNCTION ODCIIndexInsert (
   ODCIIndexInfo ia,
   ridlist ODCIRidList,
   newvallist varray_of_column_type,
   ODCIEnv env)
RETURN NUMBER
  where the ridlist argument is an array with all the ROWIDs.
  The point is: how can I exploit that functionality from JDBC, or how can I do massive inserts efficiently from Java (very important if your app is inserting tweets or FB comments)?
  The tip is: the setExecuteBatch method of OraclePreparedStatement.
  Broadly speaking, the code should be of the form:
        // requires import oracle.jdbc.OraclePreparedStatement
        PreparedStatement ps =
            conn.prepareStatement("insert into test_source_big_a values (?,?,?,?,?)");
        // Change batch size for this statement to 500
        ((OraclePreparedStatement) ps).setExecuteBatch(500);
        for (int i = 0; i <= 500; i++) {
            ps.setInt(1, i);
            ps.setString(2, "name-" + i);
            ps.setString(3, "owner-" + i);
            ps.setString(4, "this is a long test using sendBatch - statement" + i);
            ps.setString(5, "type-" + i);
            ps.executeUpdate(); // JDBC queues this for later execution
        }
        ((OraclePreparedStatement) ps).sendBatch(); // JDBC sends the queued requests
        conn.commit();
   Calling that piece of code generates 6 batches of rows:
       100 + (4 x 93) + 98
   which from the point of view of the index implementation is much more efficient, because instead of enqueuing 500 messages in the AQ it only enqueues 6!!
   If you queue more rows than the batch parameter (500), they will be sent automatically without waiting for sendBatch or commit.
   A parameter that Kuassi recommended I consider is the SDU (session data unit), which is defined at the SQL*Net level to avoid SQL*Net packet fragmentation.
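   For reference, the SDU is negotiated per connection and can be set in the Oracle Net configuration; a minimal sketch of a tnsnames.ora entry (host, port, service name, and the 8192 value are only illustrative):

MYDB =
  (DESCRIPTION =
    (SDU = 8192)
    (ADDRESS = (PROTOCOL = TCP)(HOST = myhost)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = mydb))
  )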
  Note that this is not only much more efficient from the point of view of the inserted tables, but also with respect to network usage, since it generates far fewer round-trips between the application side and the RDBMS.
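  For comparison, the same insert could also be written with the standard JDBC batching API (addBatch/executeBatch), which the Oracle driver likewise executes as batched operations; this is only a sketch and the batch size of 100 is arbitrary:

        PreparedStatement ps =
            conn.prepareStatement("insert into test_source_big_a values (?,?,?,?,?)");
        for (int i = 0; i <= 500; i++) {
            ps.setInt(1, i);
            ps.setString(2, "name-" + i);
            ps.setString(3, "owner-" + i);
            ps.setString(4, "this is a long test using executeBatch - statement" + i);
            ps.setString(5, "type-" + i);
            ps.addBatch();                 // queue the row on the client side
            if ((i + 1) % 100 == 0)
                ps.executeBatch();         // send the queued rows to the RDBMS
        }
        ps.executeBatch();                 // flush the remaining rows
        conn.commit();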

Thursday, May 16, 2013

ElasticSearch Server


Continuing from my previous post, the second book I worked on for Packt Publishing as a technical reviewer was "ElasticSearch Server".
It is a very good book about this search server built on the Lucene library, with many examples and concepts that help you exploit the full potential of free-text search using ElasticSearch.
As with the book described in the previous post (Apache Solr 4 Cookbook), this book was a perfect resource during the development of Scotas's Push Connector because it allowed me to integrate the product and exploit its full potential.

Overview



  • Learn the basics of ElasticSearch like data indexing, analysis, and dynamic mapping
  • Query and filter ElasticSearch for more accurate and precise search results
  • Learn how to monitor and manage ElasticSearch clusters and troubleshoot any problems that arise

Table of Contents




  • Chapter 1: Getting Started with ElasticSearch Cluster
  • Chapter 2: Searching Your Data
  • Chapter 3: Extending Your Structure and Search
  • Chapter 4: Make Your Search Better
  • Chapter 5: Combining Indexing, Analysis, and Search
  • Chapter 6: Beyond Searching
  • Chapter 7: Administrating Your Cluster
  • Chapter 8: Dealing with Problems

Authors


Rafał Kuć

Rafał Kuć is a born team leader and software developer. Currently working as a Consultant and a Software Engineer at Sematext Inc, where he concentrates on open source technologies such as Apache Lucene and Solr, ElasticSearch, and Hadoop stack. He has more than 10 years of experience in various software branches, from banking software to e-commerce products. He is mainly focused on Java, but open to every tool and programming language that will make the achievement of his goal easier and faster. Rafał is also one of the founders of the solr.pl site, where he tries to share his knowledge and help people with their problems with Solr and Lucene. He is also a speaker for various conferences around the world such as Lucene Eurocon, Berlin Buzzwords, and ApacheCon. Rafał began his journey with Lucene in 2002 and it wasn't love at first sight. When he came back to Lucene later in 2003, he revised his thoughts about the framework and saw the potential in search technologies. Then Solr came and that was it. From then on, Rafał has concentrated on search technologies and data analysis. Right now Lucene, Solr, and ElasticSearch are his main points of interest. Rafał is also the author of Apache Solr 3.1 Cookbook and the update to it—Apache Solr 4 Cookbook—published by Packt Publishing.

Marek Rogoziński

Marek Rogoziński is a software architect and consultant with more than 10 years of experience. His specialization concerns solutions based on open source projects such as Solr and ElasticSearch. He is also the co-founder of the solr.pl site, publishing information and tutorials about the Solr and Lucene library. He currently holds the position of Chief Technology Officer in Smartupz, the vendor of the Discourse™ social collaboration software.

Conclusion


Many developers know Lucene but do not know ElasticSearch as a product; for those who want to get to know it, this book is a great introduction to the product and its potential.




Apache Solr 4 Cookbook


During my summer I had the chance to work for Packt Publishing as a technical reviewer of two great books.
The first, "Apache Solr 4 Cookbook", is a very good book for entering the world of free-text search integrated into any development or portal, with real and practical examples using the latest version of Apache Solr.

Overview



  • Learn how to make Apache Solr search faster, more complete, and comprehensively scalable
  • Solve performance, setup, configuration, analysis, and query problems in no time
  • Get to grips with, and master, the new exciting features of Apache Solr 4


Table of Contents


  • Chapter 1: Apache Solr Configuration
  • Chapter 2: Indexing Your Data
  • Chapter 3: Analyzing Your Text Data
  • Chapter 4: Querying Solr
  • Chapter 5: Using the Faceting Mechanism
  • Chapter 6: Improving Solr Performance
  • Chapter 7: In the Cloud
  • Chapter 8: Using Additional Solr Functionalities
  • Chapter 9: Dealing with Problems
  • Appendix: Real-life Situations

Author


Rafał Kuć is a born team leader and software developer. Currently working as a Consultant and a Software Engineer at Sematext Inc, where he concentrates on open source technologies such as Apache Lucene and Solr, ElasticSearch, and Hadoop stack. He has more than 10 years of experience in various software branches, from banking software to e-commerce products. He is mainly focused on Java, but open to every tool and programming language that will make the achievement of his goal easier and faster. Rafał is also one of the founders of the solr.pl site, where he tries to share his knowledge and help people with their problems with Solr and Lucene. He is also a speaker for various conferences around the world such as Lucene Eurocon, Berlin Buzzwords, and ApacheCon. Rafał began his journey with Lucene in 2002 and it wasn't love at first sight. When he came back to Lucene later in 2003, he revised his thoughts about the framework and saw the potential in search technologies. Then Solr came and that was it. From then on, Rafał has concentrated on search technologies and data analysis. Right now Lucene, Solr, and ElasticSearch are his main points of interest. Rafał is also the author of Apache Solr 3.1 Cookbook and the update to it—Apache Solr 4 Cookbook—published by Packt Publishing.


Conclusion

If you are about to start a development effort or enter the world of free-text search, this book is a very good investment of time and resources; its practical examples really help you acquire new concepts in a simple and practical way with minimal effort.




Monday, November 28, 2011

Implementing Oracle parallel shared server process in Java inside the Database

Behind the implementation of the latest LDI open source project and the OLS products there is a functionality not well known by Oracle Java database developers; I call it the parallel shared server process.
The idea is to have an Oracle shared server process running during the life-time of the instance, which means a process automatically started during database startup and stopped during database shutdown.
So what functionality can this process implement? On LDI it is an RMI server, on OLS a lightweight HTTP server, but basically you can implement anything you need, for example getting information from another process and filling some table, gathering statistics, consuming web services, and so on.
Let's see with an example how it works.
We will create a TEST user and some Java classes implementing a simple Hello World RMI server.
SQL> conn / as sysdba
SQL> create user test identified by test
  2  default tablespace users
  3  temporary tablespace temp
  4  quota unlimited on users;
SQL> grant connect,resource,create any job to TEST;
SQL> exec dbms_java.grant_permission( 'TEST', 'SYS:java.net.SocketPermission', 'localhost:1024-', 'listen,resolve');
SQL> exec dbms_java.grant_permission( 'TEST', 'SYS:java.net.SocketPermission', 'localhost:1024-', 'accept, resolve');
SQL> exec dbms_java.grant_permission( 'TEST', 'SYS:java.net.SocketPermission', 'localhost:1024-', 'connect, resolve');
SQL> exec dbms_java.grant_permission( 'TEST', 'SYS:java.lang.RuntimePermission', 'setContextClassLoader', '' );
The RMI interface and server implementation are loaded into the TEST user:
SQL> conn test/test
SQL> create or replace and compile java source named "mytest.Hello" as
package mytest;
import java.rmi.Remote;
import java.rmi.RemoteException;
public interface Hello extends Remote {
    String sayHello() throws RemoteException;
    int nextCount() throws RemoteException;
}
/
SQL> create or replace and compile java source named "mytest.HelloImpl" as
package mytest;
import java.rmi.Naming;
import java.rmi.RemoteException;
import java.rmi.RMISecurityManager;
import java.rmi.registry.LocateRegistry;
import java.rmi.server.UnicastRemoteObject;
public class HelloImpl extends UnicastRemoteObject implements Hello {
    static int counter = 0;
   
    public HelloImpl() throws RemoteException {
        super();
    }
    public String sayHello() {
        return "Hello World!";
    }
    public static void main(String[] args) {
        // Create and install a security manager
        if (System.getSecurityManager() == null) {
            System.setSecurityManager(new RMISecurityManager());
        }
        try {
            HelloImpl obj = new HelloImpl();
            LocateRegistry.createRegistry(1099);
            // Bind this object instance to the name "HelloServer"
            Naming.rebind("//localhost/HelloServer", obj);
            System.out.println("HelloServer bound in registry");
        } catch (Exception e) {
            System.out.println("HelloImpl err: " + e.getMessage());
            e.printStackTrace();
        }
    }
    public synchronized int nextCount() {
        return counter++;
    }
}
/
SQL> create or replace procedure HelloServ(srvName IN VARCHAR2) as LANGUAGE JAVA NAME
        'mytest.HelloImpl.main(java.lang.String [])';
/
SQL> begin
  -- Start a Cron like process (DBMS_SCHEDULER)
  DBMS_SCHEDULER.CREATE_JOB(
   job_name          =>  'HelloServJob',
   job_type          =>  'PLSQL_BLOCK',
   job_action        =>  'begin
     HelloServ(''HelloServer'');
     exception when others then
        null;
     end;',
   start_date        =>  SYSDATE,
   enabled           => false,
   auto_drop         => false);
  DBMS_SCHEDULER.SET_ATTRIBUTE_NULL (
   name           =>   'HelloServJob',
   attribute      =>   'MAX_FAILURES');
end;
/
commit;
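At this point the job exists but is not enabled yet; a quick sanity check (connected as TEST) could be:
SQL> select job_name, enabled, state from user_scheduler_jobs;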
Now we can register two database instance triggers to automatically start and stop the job.
SQL> conn / as sysdba
SQL> CREATE OR REPLACE TRIGGER start_test_srv
  AFTER STARTUP ON DATABASE
BEGIN
  -- Start a Cron like process (DBMS_SCHEDULER)
  DBMS_SCHEDULER.ENABLE('TEST.HelloServJob');
END;
/
SQL> CREATE OR REPLACE TRIGGER stop_test_srv
  BEFORE SHUTDOWN ON DATABASE
BEGIN
  -- Stop the job (DBMS_SCHEDULER)
  DBMS_SCHEDULER.STOP_JOB('TEST.HelloServJob',force=>true);
EXCEPTION WHEN OTHERS THEN
  null;
END;
/
If we proceed to do a shutdown/startup sequence the server will be up and running; we can also start the server manually by executing:
SQL> conn / as sysdba
SQL> exec DBMS_SCHEDULER.ENABLE('TEST.HelloServJob');
SQL> commit;
After doing that we can see at $ORACLE_BASE/diag/rdbms/orcl/orcl/trace a .trc file associated with the parallel shared server process which is up and running:
-bash-4.2$ cat orcl_j000_10411.trc
Trace file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_j000_10411.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/11_2_0_2_0/dbhome_1
System name:    Linux
Node name:      localhost.localdomain
Release:        2.6.38.7-server-1mnb2
Version:        #1 SMP Sun May 22 22:59:25 UTC 2011
Machine:        i686
Instance name: orcl
Redo thread mounted by this instance: 1
Oracle process number: 25
Unix process pid: 10411, image: oracle@localhost.localdomain (J000)

*** 2011-11-28 18:05:41.091
*** SESSION ID:(151.35) 2011-11-28 18:05:41.091
*** CLIENT ID:() 2011-11-28 18:05:41.091
*** SERVICE NAME:(SYS$USERS) 2011-11-28 18:05:41.091
*** MODULE NAME:(DBMS_SCHEDULER) 2011-11-28 18:05:41.091
*** ACTION NAME:(HELLOSERVJOB) 2011-11-28 18:05:41.091

HelloServer bound in registry
This process is listening on the default RMI port 1099, as we can see using:
-bash-4.2$ netstat -anp|grep ora_j0
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 :::19189                    :::*                        LISTEN      10411/ora_j000_orcl
tcp        0      0 :::1099                     :::*                        LISTEN      10411/ora_j000_orcl 
And that's all; we can connect using an RMI client from another session, for example:
SQL> create or replace and compile java source named "mytest.HelloClient" as
package mytest;
import java.rmi.Naming;
import java.rmi.RemoteException;
public class HelloClient {
    Hello obj = null;
    public HelloClient() {
        try {
            obj = (Hello)Naming.lookup("//localhost/HelloServer");
        } catch (Exception e) {
            System.out.println("HelloApplet exception: " + e.getMessage());
            e.printStackTrace();
        }
    }
    public String sayHello() throws RemoteException {
        return obj.sayHello();
    }
    public int nextCount() throws RemoteException {
        return obj.nextCount();
    }
    public static void main(String[] args) throws RemoteException {
        HelloClient helloClient = new HelloClient();
        System.out.println(helloClient.sayHello());
        System.out.println(helloClient.nextCount());
    }
}
/
SQL> create or replace procedure HelloClient(srvName IN VARCHAR2) as LANGUAGE JAVA NAME
'mytest.HelloClient.main(java.lang.String [])';
/
SQL> set define ?
SQL> set serverout on
SQL> exec dbms_java.set_output(32000);
SQL> exec HelloClient('HelloServer');
0
SQL> exec HelloClient('HelloServer');
1
Note that the server is stateful, which means it preserves state across calls; if we exit from the above SQL*Plus session and connect again we will see:

SQL> set define ?
SQL> set serverout on
SQL> exec dbms_java.set_output(32000);
SQL> exec HelloClient('HelloServer');
2
SQL> exec HelloClient('HelloServer');
3

Concluding this post, I would like to remark that this parallel shared server process runs in RDBMS space, which is not like starting an RMI server in the middle tier; the big difference is that all SQL access is implemented by accessing the RDBMS structures in the SGA directly, because it uses the internal JDBC driver.
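As an illustration of that last point, server-side code like HelloImpl can obtain a connection to its own session through the internal JDBC driver, with no network round-trip involved; a minimal sketch (the rmi_log table is hypothetical):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class ServerSideSql {
    public static void log(String msg) throws Exception {
        // Internal (kprb) driver: runs inside the current database session
        Connection conn = DriverManager.getConnection("jdbc:default:connection:");
        PreparedStatement ps =
            conn.prepareStatement("insert into rmi_log values (systimestamp, ?)");
        ps.setString(1, msg);
        ps.executeUpdate();
        ps.close(); // do not close conn, it is the session's default connection
    }
}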

Friday, November 25, 2011

New release of Lucene Domain Index based on 3.4.0 code

This new release of Lucene Domain Index (LDI) has been in the new SVN for a long time, but due to a lot of work on the commercial version at Scotas it never went public as a binary installation :(
Several things happened during this time:

  • New web site (thanks a lot to Peter Wehner for the conversion from the Google docs)
  • New hosting at SF.net (now a separate SVN, as opposed to the previous CVS at DBPrism)

The change log of this version is:


  • Use the latest merge policy implementation, TieredMergePolicy
  • Use the total RAM reported by getJavaPoolSize() when setting MaxBufferedDocs
  • Better error reporting when an Analyzer is not found
  • Replaced execute immediate with open-fetch-close functionality to avoid a core dump on 10g when double-checking for deleted rowids
  • Included a back-port of JUnit4 to JDK 1.4 for 10g releases
  • Added a parallel updater process; when working in OnLine mode this process does write operations on the LDI structure on behalf of the AQ process
  • Deletes no longer require a write-exclusive lock on the index storage; deletes are now enqueued just like inserts and updates
  • Updated source to Lucene 3.4.0 code, removed some deprecated APIs

Download the latest binary distribution from the 3.4.0 directory of the SF.net download area (tested with 10g/11gR2).
The addition of a new parallel shared server process is the major change, and it speeds up DML operations a lot; I'll write a new post on how this parallel shared server technique works.
Please report any installation issues or bugs at the Support Area of the project.

Monday, September 26, 2011

Long time no post


I found that my last post was on Sep. 14 2010, that's too bad.
The reason is the startup called Scotas; I have been working on it almost since December 2010, engineering the next level of the Oracle and Lucene integration project by adding the Solr stack.
It's a natural evolution: Solr provides, on top of Lucene, a lot of the common functionality required for an enterprise-level solution.
But the integration required solving a set of implementation challenges; one of them is a long-lived shared server process running in Java which implements a new dedicated HTTP listener working as an Oracle slave process. I will explain this topic in another technical post.
Other additions included in the set of products are the push technology used in the integrations with externally running Solr, ElasticSearch, Cassandra, and HBase.
Behind this technology we extended the idea of the Oracle/Solr integration, the Oracle ODCI API, which enables NRT (Near Real Time) synchronization of the enterprise data with a NoSQL layer, for example.
Near Real Time means that once you commit the changes on the RDBMS layer they are propagated automatically to the other side with practically zero delay, and for the two-way replicators such as Solr PC and ElasticSearch deletions are in real time, avoiding false positive hits on deleted rows.
Near Real Time synchronization is declarative, no programming effort is required, and it allows multiple columns of one table, or multiple columns of many tables connected by a foreign key.
Well, next week is OOW 2011; I'll be there and I hope you will be too. Anybody coming to the event who is interested in this technology can meet me there, just drop me an email at mochoa-at-scotas.com.