Jmate is now Jcaki


I have renamed jmate to jcaki. "Caki" (actually "çakı") means small knife in Turkish. The current version is V1.0 Alpha.

Jmate 0.6 is out. Further simplification and reduction.

Jmate 0.6 is out. This release breaks backwards compatibility (sorry, it is still pre-alpha) and trims a lot of code. Here are some of the changes:
- Collections-related classes are out. Use Google Collections instead.
- SimpleFileReader and SimpleFileWriter are now SimpleTextReader and SimpleTextWriter. Some convenient factory methods were added to both classes. The classes now accept input and output streams.
- The KeyValueReader class is added.
- The Bytes class is added for many byte functions (byte-int-short conversions with endianness support).
- New handy substring-related String methods are added. Please see the wiki page.
- More unit tests.
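
To give an idea of what "byte-int conversions with endianness support" means in practice, here is a small sketch using only the JDK's ByteBuffer. It is not the Bytes class from the library; the class name EndianSketch and its two methods are hypothetical and exist only for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration; this is NOT the jmate Bytes API.
class EndianSketch {

    // Encodes an int as 4 bytes in the requested byte order.
    static byte[] toBytes(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }

    // Decodes the first 4 bytes of the array as an int in the requested byte order.
    static int toInt(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getInt();
    }

    public static void main(String[] args) {
        byte[] big = toBytes(0x01020304, ByteOrder.BIG_ENDIAN);
        byte[] little = toBytes(0x01020304, ByteOrder.LITTLE_ENDIAN);
        // Big endian stores the most significant byte first, little endian the least.
        System.out.println(big[0] + " " + little[0]); // prints "1 4"
        System.out.println(toInt(little, ByteOrder.LITTLE_ENDIAN) == 0x01020304); // prints "true"
    }
}
```

The same four bytes decode to different numbers depending on the byte order, which is why a conversion helper has to take the order as a parameter.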

You can download it from here. Check the wiki page and the change details for more information.

Visual Studio, Flash CS4 code editing support sucks.

People bash Java. But there is one thing where none of the other languages and tools even come close: its IDE and editor support. You understand it once you sit in front of either Adobe CS4 ($400-700) for actual coding or Visual Studio (Pro version $800, $2000 with MSDN) and compare it to NetBeans, Eclipse, or my favorite, IntelliJ IDEA. What a joke.

jmate 0.5 simplification, CountingSet

OK, I have made the Jmate 0.5 release. In this release I decided to simplify some usages of SimpleFileReader. Actually, I still have a feeling that this class may need to be dedicated to text files only, so I may make a drastic name change later. But here are the changes I made:
I eliminated the "Filter" usage for building powerful text readers, because it was too complex for beginners to understand and clumsy for the rest. I added new methods with reasonable names instead. Before, reading all lines from a text file while applying trim, ignoring empty lines, and allowing only lines that match a regular expression looked like this:

List<String> list = new SimpleFileReader.Builder("test/multi_line_text_file.txt")
    .filters(StringFilters.PASS_ONLY_TEXT, StringFilters.newPrefixFilter("^[^#]"))
    .trim()
    .build()
    .asStringList();

Now it is like this:

List<String> list = new SimpleFileReader.Builder("test/multi_line_text_file.txt")
    .allowMatchingRegexp("^[^#]")
    .ignoreWhiteSpaceLines()
    .trim()
    .build()
    .asStringList();

To me this is cleaner, and the user does not need to know yet another class.
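
For comparison, this is roughly what such a reader chain has to do when written with only the JDK. It is a sketch, not the library's implementation: the class name FilteredLines and its read method are made up for illustration, and it assumes that lines are trimmed before the regular expression is applied and that the expression only needs to match somewhere in the line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical plain-JDK equivalent; NOT the SimpleFileReader source.
class FilteredLines {

    // Reads all lines, trims them, drops empty ones, and keeps only
    // the lines in which the regular expression finds a match.
    static List<String> read(Reader source, String regexp) throws IOException {
        Pattern pattern = Pattern.compile(regexp);
        List<String> result = new ArrayList<String>();
        BufferedReader reader = new BufferedReader(source);
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.length() == 0)
                    continue; // white-space-only line
                if (!pattern.matcher(trimmed).find())
                    continue; // does not match the expression
                result.add(trimmed);
            }
        } finally {
            reader.close();
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        String text = "# comment\n\n  apple  \npear\n";
        System.out.println(read(new StringReader(text), "^[^#]")); // prints "[apple, pear]"
    }
}
```

Taking a Reader instead of a file name keeps the sketch easy to try without touching the disk; a FileReader or InputStreamReader can be passed in the same way.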

There is also a new concept called Template, which holds the properties of a SimpleFileReader, so I can use it to generate other SimpleFileReader objects. This is a slightly advanced use; I doubt it will be used frequently.

SimpleFileReader.Template template = new SimpleFileReader.Template()
    .allowMatchingRegexp("^[^#]")
    .ignoreWhiteSpaceLines()
    .trim();

List<File> files = Files.crawlDirectory(new File("blah"));
for (File file : files) {
    SimpleFileReader sr = template.generateReader(file);
    // .... read it, do something..
}

I fixed a bug in the Strings.insertFrom... methods. Now you can insert white space as well.

There is a new class called "CountingSet". It is not a real "Set", but what it does is something I need frequently: it counts the elements that you add. For example:

CountingSet<String> histogram = new CountingSet<String>();
histogram.add("Apple", "Pear", "Plum", "Apple", "Apple", "Grape", "Pear");
for (String s : histogram)
    out.println(s + " count:" + histogram.getCount(s));

will give you

Pear count:2
Apple count:3
Plum count:1
Grape count:1

Also you can sort the items by frequency or with a Comparator.

for (String s : histogram.getSortedList())
    out.println(s + " count:" + histogram.getCount(s));

Apple count:3
Pear count:2
Plum count:1
Grape count:1

CountingSet is not remarkably fast or efficient because it uses a Map inside. I may try optimizing it for speed and memory efficiency later. There is also a MultiFileReader class, but it is not yet public. I may have second thoughts about it.
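
The Map-based idea can be sketched in a few lines. The class below is a hypothetical stand-in written for this illustration, not the actual CountingSet source; its name and the single-element add method are assumptions, and only getCount and getSortedList mirror the method names used above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration; NOT the jmate CountingSet source.
class CountingSketch<T> {
    private final Map<T, Integer> counts = new HashMap<T, Integer>();

    // Increments the counter of the given element.
    void add(T item) {
        Integer current = counts.get(item);
        counts.put(item, current == null ? 1 : current + 1);
    }

    // Returns how many times the element was added, 0 if never.
    int getCount(T item) {
        Integer current = counts.get(item);
        return current == null ? 0 : current;
    }

    // Returns the distinct elements, most frequent first.
    // The order among elements with equal counts is unspecified.
    List<T> getSortedList() {
        List<T> keys = new ArrayList<T>(counts.keySet());
        Collections.sort(keys, new Comparator<T>() {
            public int compare(T a, T b) {
                return counts.get(b) - counts.get(a);
            }
        });
        return keys;
    }

    public static void main(String[] args) {
        CountingSketch<String> histogram = new CountingSketch<String>();
        for (String s : new String[] {"Apple", "Pear", "Plum", "Apple", "Apple", "Grape", "Pear"})
            histogram.add(s);
        System.out.println(histogram.getCount("Apple"));      // prints "3"
        System.out.println(histogram.getSortedList().get(0)); // prints "Apple"
    }
}
```

Every add costs a map lookup plus a boxed Integer, which is the kind of overhead that a primitive int counter per key would avoid.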

You can download the version 0.5 from http://code.google.com/p/jmate/downloads/list
Changes can be seen here: http://code.google.com/p/jmate/source/list

jmate 0.4 is out

So I decided to release version 0.4 of jmate. It has some minor additions, nothing serious. You can download it here: http://code.google.com/p/jmate/ . Changes can be seen here: http://code.google.com/p/jmate/source/list (I am too lazy to write a changelog now). I removed some more classes; it is still around 70 KB. I used the library in another small project of ours, tr-spell. You can check how it reduces the amount of code, especially when dealing with files, for example in the WordListGenerator class.

Helpers to the rescue

Whenever a scripting language wants to make a good impression, its advocates compare it with Java using deliberately verbose code snippets: reading a file, some regular expression tricks, or even a Hello World application.

Java is indeed not a new language, and by default it is not really suitable for some scripting tasks. However, when you don't have time to learn a scripting language for such tasks (or for any task, actually), helper libraries can be handy. Actually, that is the spirit of Java: if something can be reused, make it a small library instead of doing everything with the core language or libraries over and over.

There are several good helper libraries available, like Apache Commons or Google Collections. But Commons suffers from age and fragmentation. Google Collections is modern but may not be suitable for small applications. So I decided to mix and match (in other words, I stole code from those libs) and write a little bit myself for some of our applications. My aim is to always keep the library size under 100 KB. It may have a lot of flaws, but so far it has worked for me. I call the project jmate; the project code is here.

Nothing really fancy, but as a result you can write such code now:

Read a UTF-8 file as a String
String content = new SimpleFileReader("foo.txt", "utf-8").asString();

Read a file as a String List
List<String> lst = new SimpleFileReader("foo.txt").asStringList();

Read a UTF-8 file as a String list, but skip the non-text lines and trim the lines. Regular expression filters can also be built.
List<String> lst = new SimpleFileReader.Builder("foo.txt")
    .encoding("utf-8")
    .filters(StringFilters.PASS_ONLY_TEXT)
    .trim()
    .build()
    .asStringList();

Iterate through a text file's lines
LineIterator li = new SimpleFileReader("foo.txt").getLineIterator();
while (li.hasNext())
    out.println(li.next());
IOs.closeSilently(li);

Get a text file's lines in an enhanced for loop
for (String s : new SimpleFileReader("foo.txt").getIterableReader())
    out.println(s);

Write a String as a file
new SimpleFileWriter(tmpFile).writeString("Hello World!");

Write a String list as a UTF-8 file, but keep the writer open after the writeLines() call. By default SimpleFileWriter closes the stream after any of its method calls.
new SimpleFileWriter.Builder("foo.txt")
    .encoding("utf-8")
    .keepOpen(true)
    .build()
    .writeLines(Collects.newArrayList("Hello", "World"));

Save a web page as a file
new SimpleFileWriter("foo.txt").copyFromURL("http://google.com");

Get all jsp and tag files in a directory and its subdirectories
List<File> files = Files.crawlDirectory(
    new File("pages"),
    new Files.ExtensionFilter("jsp", "tag"));

There is more to write, but this is the basic idea. Neither the API nor the code is stable yet, but if you are interested you can check the code. Beware: there is shameless copying from the Google Collections and Apache Commons libraries.

Enhanced for loop and ConcurrentModificationException

A Java blog I read has an entry asking why Java behaves differently in a for-each loop and in a regular list loop. Here is an example:

Suppose we have an ArrayList initialized:

List<String> t = new ArrayList<String>();
t.add("Elma");
t.add("Armut");
t.add("Kiraz");

If we write a loop like this and remove one element from the list,

for (String s : t) {
    if (s.equals("Elma"))
        t.remove(s);
}

it throws

Exception in thread "main" java.util.ConcurrentModificationException
    at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
    at java.util.AbstractList$Itr.next(AbstractList.java:343)

But if we do it the pre-Java 5 way, using an iterator explicitly and removing the element through that iterator, the code works fine.

for (Iterator<String> it = t.iterator(); it.hasNext();) {
    if (it.next().equals("Elma"))
        it.remove();
}

Another way is to skip both the enhanced for loop and iterators and access the list by index:

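for (int i = 0; i < t.size(); i++) {
    if (t.get(i).equals("Elma"))
        t.remove(t.get(i));
}

This will not throw the exception either. (Note, though, that after a removal the following element shifts into index i, so the i++ step skips over it.)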
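
The three loops can be put side by side in one small program to see the difference. This is only a sketch for experimenting; the class name RemoveDemo and its method names are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class; the names are chosen only for this illustration.
class RemoveDemo {

    static List<String> fruits() {
        List<String> t = new ArrayList<String>();
        t.add("Elma");
        t.add("Armut");
        t.add("Kiraz");
        return t;
    }

    // Removes through the list inside a for-each loop.
    // Returns true if ConcurrentModificationException was thrown.
    static boolean forEachRemoveThrows() {
        List<String> t = fruits();
        try {
            for (String s : t) {
                if (s.equals("Elma"))
                    t.remove(s);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes through the iterator itself.
    static List<String> iteratorRemove() {
        List<String> t = fruits();
        for (Iterator<String> it = t.iterator(); it.hasNext();) {
            if (it.next().equals("Elma"))
                it.remove();
        }
        return t;
    }

    // Removes through the list while looping by index; no iterator is involved.
    static List<String> indexRemove() {
        List<String> t = fruits();
        for (int i = 0; i < t.size(); i++) {
            if (t.get(i).equals("Elma"))
                t.remove(t.get(i));
        }
        return t;
    }

    public static void main(String[] args) {
        System.out.println(forEachRemoveThrows()); // prints "true"
        System.out.println(iteratorRemove());      // prints "[Armut, Kiraz]"
        System.out.println(indexRemove());         // prints "[Armut, Kiraz]"
    }
}
```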

So why does this happen in the enhanced for loop? The answer is rather easy to guess: we are using two different remove methods. The first example uses the remove() method of the ArrayList, while the second one uses the Iterator's remove() method. To understand what is going on, we should look at the code. Actually, the exception itself gives us the hint.

at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
at java.util.AbstractList$Itr.next(AbstractList.java:343)

First of all, the enhanced for loop actually uses an Iterator; we just don't see it. That is why it is called "syntactic sugar". As far as I know, the iterator-related code is generated into the byte code during compilation. So the for-each code above is equivalent to this:

for (Iterator<String> it = t.iterator(); it.hasNext();) {
    String s = it.next();
    if (s.equals("Elma"))
        t.remove(s);
}

So let's now look at the inner code of ArrayList for iterators (I am using JDK 6; it has some differences from JDK 5). ArrayList's iterator() method uses the iterator() method of its parent class, AbstractList. It instantiates a private Iterator implementation, Itr.

private class Itr implements Iterator<E> {

One of the fields of this class is int expectedModCount, which is initialized to the modCount field of AbstractList. modCount counts the modifications made to the list. So when an Itr object is created, the iterator holds the current modification count as its reference value. If a modification is made to the list, such as an addition or a removal, modCount changes. Take ArrayList's add() method:

public boolean add(E e) {
    ensureCapacity(size + 1); // Increments modCount!!
    elementData[size++] = e;
    return true;
}

Notice the comment: the ensureCapacity method increments the modCount variable of AbstractList.

On each next() call of the iterator, the following method is called to check whether a change has been made to the list:

final void checkForComodification() {
    if (modCount != expectedModCount)
        throw new ConcurrentModificationException();
}

It throws the ConcurrentModificationException if the two counters do not match. This is because iterators are stateful objects. They contain a cursor that is incremented by each next() call. An iterator instance cannot follow changes made by the list itself, such as the list's remove() method. But if the remove() method of the Iterator is called, the state variables are updated accordingly, so it works fine. This is Itr's remove() method:

public void remove() {
    if (lastRet == -1)
        throw new IllegalStateException();
    checkForComodification();

    try {
        AbstractList.this.remove(lastRet);
        if (lastRet < cursor)
            cursor--;
        lastRet = -1;
        expectedModCount = modCount;
    } catch (IndexOutOfBoundsException e) {
        throw new ConcurrentModificationException();
    }
}
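
To see the mechanism in isolation, here is a stripped-down sketch of a list with the same modCount and expectedModCount bookkeeping. It is a simplified illustration and not the JDK source: the class name TinyList is made up, it only stores Strings, and it delegates the storage to an ArrayList.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Simplified illustration of the fail-fast idea; NOT the JDK implementation.
class TinyList implements Iterable<String> {
    private final List<String> items = new ArrayList<String>();
    private int modCount = 0; // counts structural changes to the list

    void add(String s) {
        items.add(s);
        modCount++;
    }

    void remove(String s) {
        if (items.remove(s))
            modCount++;
    }

    int size() {
        return items.size();
    }

    public Iterator<String> iterator() {
        return new Iterator<String>() {
            private int cursor = 0;
            private int lastRet = -1;
            private int expectedModCount = modCount; // snapshot taken at creation

            private void checkForComodification() {
                if (modCount != expectedModCount)
                    throw new ConcurrentModificationException();
            }

            public boolean hasNext() {
                return cursor != items.size();
            }

            public String next() {
                checkForComodification();
                if (cursor >= items.size())
                    throw new NoSuchElementException();
                lastRet = cursor;
                return items.get(cursor++);
            }

            public void remove() {
                if (lastRet == -1)
                    throw new IllegalStateException();
                checkForComodification();
                items.remove(lastRet); // removes by index
                modCount++;
                cursor = lastRet;
                lastRet = -1;
                expectedModCount = modCount; // the iterator stays in sync
            }
        };
    }
}
```

Removing through the iterator updates both counters together, so the next next() call passes the check. Removing through TinyList.remove() changes only modCount, and the iterator notices the mismatch on its next step.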