Since there isn't much sample code out there that demonstrates how you can use Java libraries, I thought I would post some!
This code makes use of LingPipe, a Java library for natural language processing. It splits a piece of text into sentences.
To install, download LingPipe and drop the .jar files into java->lib in the plug-ins folder of Illustrator.
The code I converted to JavaScript: SentenceChunkerDemo.java
// Make the LingPipe classes available by their short names.
importPackage(Packages.com.aliasi.sentences,
              Packages.com.aliasi.tokenizer,
              Packages.com.aliasi.chunk);

// The inner double quotes have to be escaped, otherwise the string ends early.
var text = new java.lang.String("This text is a test text. It's function is to be tested. What do you think of that, mr. test text? \"I don't mind.\" said the test text.");
var ca = text.toCharArray();

// Lowercase variable names, so they don't shadow the imported class names.
var tokenizerFactory = new IndoEuropeanTokenizerFactory();
var sentenceModel = new MedlineSentenceModel();
var sentenceChunker = new SentenceChunker(tokenizerFactory, sentenceModel);

// Each chunk in the chunking is one sentence, given as start/end offsets into the text.
var chunking = sentenceChunker.chunk(ca, 0, text.length());
var sentences = chunking.chunkSet();
var slice = chunking.charSequence().toString();

var i = 1;
for (var it = sentences.iterator(); it.hasNext(); ) {
    var sentence = it.next();
    var start = sentence.start();
    var end = sentence.end();
    print("SENTENCE " + (i++) + ":");
    print(slice.substring(start, end));
}
This script produces:
SENTENCE 1:
This text is a test text.
SENTENCE 2:
It's function is to be tested.
SENTENCE 3:
What do you think of that, mr. test text?
SENTENCE 4:
"I don't mind." said the test text.
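If you would rather collect the sentences in an array than print them, the start/end offsets can be applied in a small helper. This is only a sketch in plain JavaScript: extractSentences and the spans array are names I made up for illustration, and the plain {start, end} objects stand in for the chunk objects that LingPipe returns. In the script above you would fill spans from sentence.start() and sentence.end() inside the loop.

```javascript
// Hypothetical helper (not part of LingPipe): turns a list of
// {start, end} offsets into an array of sentence strings.
function extractSentences(text, spans) {
    var result = [];
    for (var i = 0; i < spans.length; i++) {
        // end is exclusive, the same convention substring() uses
        result.push(text.substring(spans[i].start, spans[i].end));
    }
    return result;
}

// Offsets written out by hand for the first two sentences of the test text.
var sample = "This text is a test text. It's function is to be tested.";
var spans = [
    { start: 0, end: 25 },
    { start: 26, end: 56 }
];
var found = extractSentences(sample, spans);
// found[0] is "This text is a test text."
// found[1] is "It's function is to be tested."
```

Having the sentences in an array makes it easier to do something else with them afterwards, such as placing each one in its own text item.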