Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.

Wednesday, April 26, 2017

OpenNLP sentence detector and Oracle JDeveloper 12c

Oracle recently released its chatbot application, and I was curious about the technology behind chatbots in general. So I explored the open source natural language processing framework OpenNLP and built a simple sentence detector application using my favourite IDE, Oracle JDeveloper 12c.

Input/Output:
Here is my first OpenNLP project, a sentence detector. For example, given the string "How are you ? This is Mike", the output should contain two sentences: "How are you ?" and "This is Mike".
OpenNLP does not ship with a sentence model for Bahasa Indonesia by default, so I created a small sample corpus to train one.

For example, the input "Apa kabar ? saya Elva" should produce the output "Apa kabar ?" and "saya Elva".
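
The sample corpus for training is just a plain text file. As a minimal sketch, assuming OpenNLP's default sentence training format (one sentence per line, UTF-8 encoded), the file could be prepared like this. The class name TrainingCorpusWriter and the sample sentences are only illustrations; the file name id-train.sent is the one passed to -data in the training command in the next section. A real corpus needs far more sentences than this to produce a useful model.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration, not part of OpenNLP
public class TrainingCorpusWriter {

    // Writes one sentence per line, encoded as UTF-8
    public static void write(Path target, List<String> sentences) throws IOException {
        Files.write(target, sentences, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Illustrative sentences only; replace with a real corpus
        List<String> sentences = Arrays.asList(
                "Apa kabar?",
                "Saya Elva.",
                "Saya tinggal di Jakarta.");
        write(Paths.get("id-train.sent"), sentences);
    }
}
```

Keeping each sentence on its own line is what lets the trainer learn where sentences end, so the file should not be line-wrapped by an editor.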

Train model for Bahasa:
OpenNLP provides a command line tool for training a new model. To train a new Bahasa sentence detector, use the following command:

$ ./opennlp SentenceDetectorTrainer -model ../id-opennlp-models/id-sent.bin -lang id -data ../id-opennlp-models/id-train.sent -encoding UTF-8

This command generates id-sent.bin as our model file, which can then be used in our project:

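    // Needs java.io.FileInputStream, java.io.IOException, java.io.InputStream,
    // opennlp.tools.sentdetect.SentenceDetectorME, opennlp.tools.sentdetect.SentenceModel
    // and opennlp.tools.util.InvalidFormatException.
    public static void SentenceDetectIndo() throws InvalidFormatException,
                    IOException {
            String paragraph = "Apa Kabar? Saya Elva.";
            // always start with a model; a model is learned from training data.
            // try-with-resources closes the stream even if loading the model fails
            try (InputStream is = new FileInputStream("id-sent.bin")) {
                    SentenceModel model = new SentenceModel(is);
                    SentenceDetectorME sdetector = new SentenceDetectorME(model);

                    // print every detected sentence instead of assuming exactly two
                    for (String sentence : sdetector.sentDetect(paragraph)) {
                            System.out.println(sentence);
                    }
            }
    }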

Output:
Apa Kabar?
Saya Elva.

Conclusion:
OpenNLP is a powerful open source natural language processing framework that is easy to use and lets us train our own models with a simple command line tool. In this project, I created a simple sentence detector application, trained my own Bahasa language model, and used it to detect sentences in Bahasa.

The sample project file can be downloaded here (sorry about the big file size of 57 MB; the project includes all the required libraries and the English models).


Sunday, January 1, 2017

Using ADF View Object to display huge datasets

Issue:
In some cases, we may have a huge database table with millions of records or more. Querying and displaying these records may slow down your web application if it is not coded efficiently.

Solution:
Using the Oracle ADF View Object component, you can tune the database query easily. If you only need to display a certain number of records, you can use the View Object tuning configuration to set the number of rows to fetch for the UI. This improves performance by reducing the number of records that need to be returned from the database and displayed.

Example:
In the ADF Model project, double-click the View Object and go to the General tab. You should see the Tuning section there.



Set "Only up to row number" to display only the "Top-N" entries in the page and to avoid having the database return all the records.

Often you may also want to use the Query Optimizer Hint FIRST_ROWS, which tells the database that you want the first rows returned as quickly as possible rather than optimizing the retrieval of all records.
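
To make these two settings more concrete, here is a rough plain-JDBC sketch of the same ideas outside ADF: the hint goes into a /*+ ... */ comment right after the SELECT keyword, and Statement.setMaxRows caps how many rows are returned. This is not the code that ADF generates. The class TopNQuery, its method names and the EMPLOYEES table are hypothetical, chosen only for illustration.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration, not an ADF class
public class TopNQuery {

    // Places an optimizer hint comment directly after the SELECT keyword
    public static String withHint(String select, String hint) {
        String trimmed = select.trim();
        if (!trimmed.regionMatches(true, 0, "SELECT", 0, 6)) {
            throw new IllegalArgumentException("Expected a SELECT statement");
        }
        return "SELECT /*+ " + hint + " */" + trimmed.substring(6);
    }

    // Reads the first column of at most maxRows rows
    public static List<String> fetchTopN(Connection conn, String sql, int maxRows)
            throws SQLException {
        List<String> values = new ArrayList<>();
        try (Statement stmt = conn.createStatement()) {
            stmt.setMaxRows(maxRows); // rows beyond this limit are dropped
            try (ResultSet rs = stmt.executeQuery(sql)) {
                while (rs.next()) {
                    values.add(rs.getString(1));
                }
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // EMPLOYEES is an assumed table name
        System.out.println(withHint("SELECT * FROM EMPLOYEES", "FIRST_ROWS"));
        // prints: SELECT /*+ FIRST_ROWS */ * FROM EMPLOYEES
    }
}
```

In an ADF application you would normally not write this by hand, since the View Object tuning options cover both settings declaratively.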