I removed the "Filter" mechanism for building powerful text readers, because it was too complex for beginners to understand and clumsy for everyone else. I added new methods with reasonable names instead. Before, reading all lines from a text file while trimming them, ignoring empty lines and allowing only lines that match a regular expression looked like this:
List<String> list = new SimpleFileReader.Builder("test/multi_line_text_file.txt")
    .filters(StringFilters.PASS_ONLY_TEXT, StringFilters.newPrefixFilter("^[^#]"))
    .trim()
    .build()
    .asStringList();
Now it is like this:
List<String> list = new SimpleFileReader.Builder("test/multi_line_text_file.txt")
    .allowMatchingRegexp("^[^#]")
    .ignoreWhiteSpaceLines()
    .trim()
    .build()
    .asStringList();
To me this is cleaner, and users do not need to learn yet another class.
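For readers who want to see what such a chain does to the lines, here is a minimal sketch in plain Java. It is not the jmate implementation: the class LineFilterSketch and the method filterLines are hypothetical names, and the order of the steps (trim first, then drop whitespace-only lines, then keep lines where the regular expression is found) is my assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class LineFilterSketch {

    // Hypothetical helper, not part of jmate: trims each line, drops
    // whitespace-only lines, and keeps only lines in which the regexp is found.
    static List<String> filterLines(List<String> lines, String regexp) {
        Pattern pattern = Pattern.compile(regexp);
        List<String> result = new ArrayList<String>();
        for (String line : lines) {
            String trimmed = line.trim();                 // like trim()
            if (trimmed.isEmpty()) {
                continue;                                 // like ignoreWhiteSpaceLines()
            }
            if (pattern.matcher(trimmed).find()) {        // like allowMatchingRegexp(...)
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "  first line  ", "", "# a comment", "   ", "second line");
        // prints [first line, second line]
        System.out.println(filterLines(lines, "^[^#]"));
    }
}
```

The regular expression "^[^#]" only accepts lines whose first character is not '#', so comment lines are dropped along with the blank ones.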
There is also a new concept called Template, which holds the properties of a SimpleFileReader, so I can use it to generate other SimpleFileReader objects. This is a slightly advanced use; I doubt it will be used frequently.
SimpleFileReader.Template template = new SimpleFileReader.Template()
    .allowMatchingRegexp("^[^#]")
    .ignoreWhiteSpaceLines()
    .trim();
List<File> files = Files.crawlDirectory(new File("blah"));
for (File file : files) {
    SimpleFileReader sr = template.generateReader(file);
    // .... read it, do something..
}
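The idea behind a template can be sketched in plain Java like this. It is only an illustration of the pattern, not the jmate code: TemplateSketch, its nested LineReader and Template classes, and the use of in-memory lists instead of files are all assumptions made to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class TemplateSketch {

    // Hypothetical stand-in for a configured reader; it works on in-memory lines.
    static class LineReader {
        private final List<String> source;
        private final boolean trim;
        private final boolean ignoreWhiteSpaceLines;
        private final Pattern allowed; // null means "allow every line"

        LineReader(List<String> source, boolean trim,
                   boolean ignoreWhiteSpaceLines, Pattern allowed) {
            this.source = source;
            this.trim = trim;
            this.ignoreWhiteSpaceLines = ignoreWhiteSpaceLines;
            this.allowed = allowed;
        }

        List<String> asStringList() {
            List<String> result = new ArrayList<String>();
            for (String line : source) {
                String s = trim ? line.trim() : line;
                if (ignoreWhiteSpaceLines && s.trim().isEmpty()) continue;
                if (allowed != null && !allowed.matcher(s).find()) continue;
                result.add(s);
            }
            return result;
        }
    }

    // Holds the settings once and creates a configured reader per source.
    static class Template {
        private boolean trim;
        private boolean ignoreWhiteSpaceLines;
        private Pattern allowed;

        Template trim() { this.trim = true; return this; }

        Template ignoreWhiteSpaceLines() { this.ignoreWhiteSpaceLines = true; return this; }

        Template allowMatchingRegexp(String regexp) {
            this.allowed = Pattern.compile(regexp);
            return this;
        }

        LineReader generateReader(List<String> source) {
            return new LineReader(source, trim, ignoreWhiteSpaceLines, allowed);
        }
    }

    public static void main(String[] args) {
        Template template = new Template()
                .allowMatchingRegexp("^[^#]")
                .ignoreWhiteSpaceLines()
                .trim();
        // The same settings are reused for two different sources.
        System.out.println(template.generateReader(Arrays.asList(" a ", "#skip")).asStringList()); // [a]
        System.out.println(template.generateReader(Arrays.asList("", " b ")).asStringList());      // [b]
    }
}
```

The settings are written once and every generated reader gets a copy of them, which is what makes the loop over many files short.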
I eliminated a bug in Strings.insertFrom... methods. Now you can add white spaces as well.
There is a new class called "CountingSet". It is not a real "Set", but it does something I need frequently: it counts the elements that you add. For example:
CountingSet<String> histogram = new CountingSet<String>();
histogram.add("Apple", "Pear", "Plum", "Apple", "Apple", "Grape", "Pear");
for (String s : histogram)
    out.println(s + " count:" + histogram.getCount(s));
This will give you:
Pear count:2
Apple count:3
Plum count:1
Grape count:1
Also you can sort the items by frequency or with a Comparator.
for (String s : histogram.getSortedList())
    out.println(s + " count:" + histogram.getCount(s));
Apple count:3
Pear count:2
Plum count:1
Grape count:1
CountingSet is not remarkably fast or efficient because it uses a Map internally. I may try optimizing it for speed and memory efficiency later. There is also a MultiFileReader class, but it is not public yet; I may have second thoughts about it.
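To show what "counting with a Map inside" can look like for CountingSet, here is a minimal sketch. It is not the jmate source: the class name CountingSetSketch is hypothetical, and I chose a LinkedHashMap so that iteration follows insertion order and ties in the sorted list come out predictably, which the real class may not do.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CountingSetSketch<T> implements Iterable<T> {

    // Maps each element to the number of times it was added.
    private final Map<T, Integer> counts = new LinkedHashMap<T, Integer>();

    @SafeVarargs
    public final void add(T... items) {
        for (T item : items) {
            Integer old = counts.get(item);
            counts.put(item, old == null ? 1 : old + 1);
        }
    }

    // Returns 0 for elements that were never added.
    public int getCount(T item) {
        Integer c = counts.get(item);
        return c == null ? 0 : c;
    }

    // Distinct elements, most frequent first; the sort is stable,
    // so elements with equal counts keep their insertion order.
    public List<T> getSortedList() {
        List<T> keys = new ArrayList<T>(counts.keySet());
        Collections.sort(keys, new Comparator<T>() {
            public int compare(T a, T b) {
                return getCount(b) - getCount(a);
            }
        });
        return keys;
    }

    // Iterates over the distinct elements only.
    public Iterator<T> iterator() {
        return counts.keySet().iterator();
    }

    public static void main(String[] args) {
        CountingSetSketch<String> histogram = new CountingSetSketch<String>();
        histogram.add("Apple", "Pear", "Plum", "Apple", "Apple", "Grape", "Pear");
        for (String s : histogram.getSortedList()) {
            System.out.println(s + " count:" + histogram.getCount(s));
        }
    }
}
```

Every add and getCount is a map lookup with boxed Integer values, which is the kind of overhead a more specialized structure could reduce.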
You can download version 0.5 from http://code.google.com/p/jmate/downloads/list
The changes can be seen here: http://code.google.com/p/jmate/source/list