http://en.wikipedia.org/wiki/Byte-order_mark
Unicode files can start with a byte order mark (BOM): a few bytes at the very beginning of the file (EF BB BF for UTF-8) that signal it is in fact a Unicode file. The whole thing is a mess. Sorry to drag you into character encoding hell, but we are almost totally out. =)
A simple test case is attached. Edit the pom for your DB config. The first test, which does not have the byte order mark, passes. However, the second test fails on the mark. It would be sweet if the plugin knew to skip it.
There is a lot of inconsistency right now regarding the BOM. Certain tools are practically forcing us to use the mark, but while the same people are saying "it's good, we want you to use it", the other half of their tools don't support it. Go figure.
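Quick outline of fix:
void sendSQLFile() {
    // Only look for a BOM when the configured encoding is a Unicode one (UTF-8, UTF-16, ...).
    boolean isUnicode = config.encoding != null && config.encoding.toUpperCase().startsWith("UTF");
    int c = readTheFirstCharOfFile();
    // 0xFEFF is the BOM once the file is decoded to chars.
    // Maybe add a config parameter for this, defaulted to false?
    if (c == 0xFEFF && isUnicode && config.unicodeFilesHaveByteOrderMark) {
        // ignore c
    }
    // otherwise c is ordinary content and must still be sent along with the rest
    readRestOfFile();
}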
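For what it's worth, here is a slightly more complete sketch of the same idea in plain Java. It assumes the plugin reads the script through a java.io.Reader. The class name BomSkipper, the method names and the filesHaveBom flag are made up for illustration and are not part of the plugin. As far as I know, Java's UTF-8 decoder leaves a leading BOM in place as the char U+FEFF, so this check happens after decoding. A PushbackReader is used so the first char is put back when it is not a BOM.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Locale;

// Hypothetical helper, not the plugin's actual code.
final class BomSkipper {
    static final int BOM = 0xFEFF; // the byte order mark once decoded to a char

    // Returns a reader positioned after a leading BOM, if one is present.
    // encoding and filesHaveBom stand in for the proposed config values.
    static Reader skipBom(Reader in, String encoding, boolean filesHaveBom) throws IOException {
        boolean isUnicode = encoding != null
                && encoding.toUpperCase(Locale.ROOT).startsWith("UTF");
        if (!isUnicode || !filesHaveBom) {
            return in; // feature switched off, leave the stream untouched
        }
        PushbackReader reader = new PushbackReader(in, 1);
        int c = reader.read();
        if (c != -1 && c != BOM) {
            reader.unread(c); // not a BOM, so put the char back
        }
        return reader;
    }

    // Convenience wrapper: reads the whole script, with the BOM skipped if configured.
    static String readSql(Reader in, String encoding, boolean filesHaveBom) {
        try {
            Reader reader = skipBom(in, encoding, filesHaveBom);
            StringBuilder sql = new StringBuilder();
            int c;
            while ((c = reader.read()) != -1) {
                sql.append((char) c);
            }
            return sql.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With the flag defaulted to false, existing setups would behave exactly as they do today. Only files read with the flag on and a UTF encoding get the first char checked.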
What is a BOM? Can you scale this down to a small pom file that reproduces the issue?