Maximum für Gruppen von Daten ermitteln in MySQL

Heute stand ich (wahrscheinlich zum xten Mal) vor dem Problem in MySQL den maximalwert pro Gruppierung zu ermitteln. Mein Beispiel: Kunden kaufen Artikel – ich will wissen welchen Artikel jeder Kunde am meisten gekauft hat. Kurze Suche im Internet hat mich auf folgende Seite geführt: Wie bekommt man die Zeile mit dem Maximalwert pro Gruppe? Auf dieser Seite werden zwei Möglichkeiten vorgestellt. Eine mit Subselect und eine mit Join. Als jemand der doch ein wenig Erfahrung mit MySQL hat (denke ich zumindest), weiß ich, das Subselects böse sind und oft mit Joins ersetzt werden könnnen, welche schneller sind – zum Teil extrem schneller. Das wollte ich doch gleich mal an meinem praktischen Beispiel ausprobieren.

Meine Tabelle hat folgende Struktur:

  • id – INT – Primary Key als auto increment
  • customer_id – INT – Kunden ID
  • qty_sum – INT – Summierte Anzahl für Produkt und Kunde
  • product_id – INT – Produkt ID

Also sehr einfach.

Das Beispiel von der Webseite oben umgeschrieben auf meine Tabelle in beiden Ausführungen. Zuerst als Subselect:

SELECT t.*
FROM tmp t
WHERE t.qty_sum = (
SELECT MAX(t2.qty_sum) FROM tmp t2
WHERE t2.customer_id=t.customer_id
)

Und dann als Join-Variante:

SELECT t.*
FROM tmp t
LEFT JOIN tmp t2 ON t2.customer_id=t.customer_id AND t2.qty_sum > t.qty_sum
WHERE t2.qty_sum IS NULL

In diesem Beispiel (Join) wird der Eintrag gesucht, für den es keinen größere qty_sum gibt. Das muss das Maximum sein.

Wie im Link oben erwähnt, haben beide Queries ein Problem, wenn es die größte Anzahl mehrmals gibt, z.B. wenn ein Kunde zwei verschiedene Produkt jeweils drei mal gekauft hat. Dann gibt es für diesen Kunden zwei Zeilen im Ergebnis. Wenn das nicht gewollt ist und die Auswahl, welche Zeile genommen wird, egal ist, dann einfach ein
GROUP BY t.customer_id
and das SQL hängen. Funktioniert mit beiden Versionen (Join und Subselect).

Aber jetzt zur Frage, welche Version schneller ist. Zur Info. Meine Tabelle enthält fast 400.000 Zeilen. Ist also nicht ganz klein aber auch nicht riesig groß. Ich habe in beiden Queries ein SQL_NO_CACHE hinzugefügt, damit (hoffentlich) der Query Cache keine Rolle spielt.

Dauer für Subselect: (unklar, wurde nach 1200 Sekunden abgebrochen – also 20 Minuten)
Dauer für Join: (unklar, wurde nach 1200 Sekunden abgebrochen – also 20 Minuten)

Oh, oh. Das ist aber lange. Viel zu lange. So lange das ich nicht mehr warten wollte und das einfach mal abgebrochen habe. Wer etwas Ahnung hat, der wird sicherlich schon merken, dass hier ein Index fehlt. Dadurch dürfte man die Query-Zeit deutlich verkürzen. Also einen Index auf customer_id erstellt und siehe da: Es geht viel schneller.

Dauer für Subselect mit Index (customer_id): rund 3 Sekunden
Dauer für Join mit Index (customer_id): rund 2,6 Sekunden

Das ist ja schon deutlich besser. Damit könnte man bei einer Auswertung leben. Wenn man das natürlich live irgendwo anzeigen will, wäre das immer noch etwas lange. (In diesem Fall würde ich die Ergebnisse regelmäßig in eine Zwischen-Tabelle packen und dann von dort aus direkt laden.)

Was passiert, wenn ich einen Index erstelle, der nicht nur customer_id, sondern auch noch den anderen benutzten Wert (qty_sum) enthält. Also einen zweiten Index erstellt mit customer_id und qty_sum. (MySQL ist nicht in der Lage, in einem Query zwei verschiendene Index zu nutzen.) Wenn man jetzt die Selects wieder laufen lässt, brauchen Sie genauso lange wie eben auch. Ein EXPLAIN ergibt dann auch, dass der zweite Index mit customer_id und qty_sum gar nicht genutzt wird, sonder weiterhein der Index nur mit customer_id genutzt wird.

Also was kann man in diesem Fall als Ergebnis hinstellen: Jojn oder Subselect ist egal. Beide gleich schnell bzw. langsam. Aber extrem wichtig ist der Index.

Swype für iOS Emoticons (auf Deutsch)

Swype ist eine alternative Tastatur für iOS. Im Gegensatz zu normalen Tastaturen tippt man hier nicht auf die Buchstaben, um ein Wort zu schreiben, sondern wischt darüber. Ob man das mag oder nicht hängt von den persönlichen Vorlieben ab. Ich mag es sehr und komme sehr gut damit klar. Die App gibts im iTunes Store und kostet dort 0,99€.

Ein kleines Problem hat man mit Emoticons und Smileys. Das ist nicht so ganz optimal gelöst. Bisher habe ich immer zu einer anderen Tastatur gewechselt, um dann die Smileys einzufügen. Der häufige Wechsel stört aber. Man kann aber auch mit Swype direkt Emoticons schreiben. Dazu muss man nur den Namen kennen, dann wird in der Vorschlagsleiste der entsprechende Smiley angezeigt. Leider habe ich keine komplette Liste gefunden, deswegen hier mal die Emoticons und Smileys, die ich bis jetzt gefunden habe. Solltet ihr weitere kennen, dann schreibt doch bitte einen Kommentar. Ich werde es dann ergänzen.

Hinter der Grafik steht der Text, mit dem man das Emoticon erzeugen kann. Falls dort mehrere Wörter stehen, dann reicht eins davon aus. Probiert’s einfach mal aus.

Smileys
smileys

Natur
natur

Essen
essen

Sonstiges
sonstiges

Grails error No signature of method is applicable for argument types when calling a service

I just spent an hour trying to figure out what went wrong with calling a method in a grails service and giving an list to the method. I always got the message

No signature of method: static ProductService.loadSomething() is applicable for argument types: (java.util.ArrayList) values: [[169646, 78872, 169642]] Possible solutions: loadSomething(java.lang.Object)

To me everything seemed correct. I even change the method loadSomething to explicitly expect an java.util.ArrayList. Did not work. I was nearly giving up, when realized the error was a simple type mistake for the service. You need to define a service, before using it:


class Abc {
def productService

def doSomething() {
productService.loadSomething( ... )
}
}

This is the way it works. My simple mistake (or maybe some auto-completion issue with my IDE) was to change “productService” into “ProductService” in the doSomething-method. See the big “P” at the beginning. The error message was not very helpful here, because you think the argument is wrong, but it isn’t the problem. So here again the not working code:


class Abc {
def productService

def doSomething() {
ProductService.loadSomething( ... )
}
}

Maybe groovy/grails developers could create a better error message in this case. Would have saved me a lot of time.

Compare git vs. mercurial

Software revision control systems are nice. They help reduce a lot of manual work – at least they should. And they often do. I have to admit that I still use subversion at work. It works quite well normally. Yesterday was one of the days, where it didn’t. We had a complex new feature created in a branch and I was the one to merge that branch into the trunk. That works flawless, if you never changed the trunk while working in the branch. Of course that’s not how things go in real life. So the trunk and branch got a few changes in the same place. And that normally means a merge conflict and manual handling. You will probably never get rid of this completely. But it’s still sometimes quite difficult, especially if the code was written/changed by someone else. And subversion is often not very good in merging. Especially if things got attached at the end of the same file in the trunk and the branch (take a css-file, where new styles got added). Subversion for some unknown reason is not able to merge that.

So the decision to switch to another revision control system grew. And there are a lot out there. Especially the distributed revision control systems do have some advantage. There are several but two seem to get the most attraction. These are git and mercurial. I know there are others out there. But I will stick to these two.

A distributed revision control system means, that every developer (or anyone with access to the system) gets a complete copy of the whole repository. Meaning that you do not need a server to do things like commit/branch. It’s nice if you are somewhere without an internet access (at least temporarily). And all these actions happen local, which makes them fast. If you’re working in a group you would normally have one (or more) shared main repositories. And you have an extra step of pushing your locally committed changes to the shared repository. Something to get used to, but not impossible.

So after yesterdays decision I started reading about these two system today. And looked at several comparisons. I learned quite a few things but I’m not able to tell, which system is the better one. Sorry to disappoint you, if you’re expecting an answer from me. I’m going to give a short summary of the differences I’ve found and maybe a hint of what to choose. Before going into detail, I want to make clear, that I’ve not tried the system. At least not enough to give a real using point of view.

What I’ve learned so far is that git seams to be the more complex system. While mercurial does hide a lot of complexity under the hood. It’s there in mercurial, but you won’t see it immediately. Only if you want to. That’s a point for mercurial. Some of the more advanced features in mercurial are available through plugins. That also makes git a bit harder to learn.

Git seems to give you more freedom, especially when changing your revision history. That’s a more or less basic thing with git, but not so common in mercurial, but still possible. Git seems to be a bit more flexible with branching, especially when you’re about to make a branch for every new feature or idea. (Something you surely wouldn’t do in subversion.) There are a few other things where git seems to be the more flexible system.

So the more flexible system with git seems to number higher, than the easier to learn and use system of mercurial. But wait a second. There’s other software, that has a huge amount of features, a lot you might even not know about. Even if you’re using the software day by day. Remember that word processor software from a very big software company. There’s so much features in that software, I guess nobody even knows about it all. And they changed the user interface to “ribbons” to hide that lot of features even better than before. (At least I’m searching around all the time for a feature that’s there, but I’ve know clue where.)

The point I want to make is this: more features is not better. It’s only better if you really do use the features. Than it’s good. If you don’t use them, you won’t need them. Maybe you even do not know they exist. And than there’s no point in getting the more complex system anyway.

So how to decide what to use. I think the right way to do this, is not to compare features. But to figure out what your workflow does look like – or maybe what you wan’t it to look like. If you know what workflow you want, you know the features you want – and probably some you do not want. And than compare the systems agains your feature set – and choose the one that fits your features best. Without adding to much “not needed” complexity.

That’s something I’ll have to do tomorrow. But enough for now.

Using external configuration files in Grails with Tomcat

I’ve already described in a previous post, how to use external configuration files with grails. For a while now I’ve tried to figure out how to be able to have different configuration files for different tomcat hosts. Saddly, none of my approches worked. Yesterday I’ve stumbled upon a blog post, that gave me hope again: Grails config values per tomcat host. Though I had to do some little changes/additions to the solution, I’ve finally got this working.

This is what needs to be put as context into the server.xml (not recommended) or in some xml file under your catalina-host-directory (i.e. /etc/tomcat/Catalina/myapp.local/context.xml):

<Context path="" appBase="/var/tomcat/enamora-tracking">

<Environment name="grailsExtConfFile" value="file:/var/tomcat/enamora-tracking/conf/enamora.groovy" type="java.lang.String"/>

</Context>

Please adjust path names accordingly.

You can do this for as many tomcat-hosts as you want, with different config-files.

Now to load this in your grails application, change the code, that loads the config file in your Config.groovy to this:
def ENV_NAME = "APPNAME_CONFIG"
if(!grails.config.locations || !(grails.config.locations instanceof List)) {
grails.config.locations = []
}

// this loads your tomcat host config file
try {
def exConfig = ((Context)(new InitialContext().lookup(“java:comp/env”))).lookup(“grailsExtConfFile”)
println “Including configuration file specified in tomcat context: ” + exConfig
grails.config.locations << exConfig
} catch (Exception e) {
println(“External configuration lookup failed: ” + e.getMessage())
}

if(System.getProperty(ENV_NAME)) {
println “Including configuration file specified on command line: ” + System.getProperty(ENV_NAME);
grails.config.locations << “file:” + System.getProperty(ENV_NAME)
} else if(System.getenv(ENV_NAME)) {
println “Including configuration file specified in environment: ” + System.getenv(ENV_NAME);
grails.config.locations << “file:” + System.getenv(ENV_NAME)
} else {
println “No external configuration file defined.”
}

You might need to add two import statements to your Config.groovy. Simply add them at the beginning of Config.groovy:

import javax.naming.Context
import javax.naming.InitialContext

Grails: Using a map in a view (gsp)

I recently run into a problem using a map in a gsp with Grails. I had to fill the map myself with keys that where numeric (and in my test situation they even ranged from 1 to 4). When using the map in the view (gsp) I tried to check for existence of the corresponding element with
<g:if test="${myMap.containsKey(someKey)}">

For some weird reason this didn’t work. After a while I’ve figured out, that grails transformed the map into a HashMap. For a HashMap you need to know which type the keys and contents are. Well I didn’t. Sometimes integer worked, sometimes not. Not really funny.

After some searching I’ve found a hint, that you should always use String as type for the key. Pure and simple a String (and not a GString). So I’ve changed my map accordingly with the .toString()-method and used .toString() on the keys in the view to access the map elements and now it works like a charm.

That’s probably one of the drawbacks of a type-less kind of language. Sometimes you will get a type you never asked for.

Logging groovy.sql.Sql queries with grails

It seems like groovy.sql.Sql is not using the default log4j-logger. So logging these queries doesn’t simply work. To get this to work I figured out the following (with the help of this article, which didn’t work exactly).

Add a line to the log4j-part in Config.groovy:
trace 'groovy.sql.Sql'

Make sure you have the following lin in Config.groovy for your environemnt (or general):
grails.logging.jul.usebridge = true

This enables logging for groovy.sql.Sql in general. But you might not get anything, because the logger used by groovy.sql.Sql has still the default log level. So you might need to specifiy the log-level, when you have your groovy.sql.Sql-object:

mySql = new Sql(dataSource)
mySql.LOG.level = Level.FINE

Saving a domain class fails silently

I had a problem saving a domain class, which should have worked without problems in my opinion. But no entry was saved in the database and no exception (not even a warning) was thrown. Maybe I was setting wrong values to the domain class, but how to know for sure what’s wrong, if no message is given.

Finally I’ve found a quite useful argument to add to the save-method: failOnError:true
I do not know why I’ve never read about this before, but know if the save fails, I do get the error with a clear description what’s wrong. I’m actually wondering why this is not the default. Because you normaly want to know, if a save was sucessful or not.

In my special case the domain class had a self assigned id value. So even getting the id of the class was possible, without it ever getting save in the database. If you want this behavior not only for a certain save-call, but for all save-calls, you can add an option to Config.groovy: grails.gorm.failOnError=true (see here)

Using external configuration file in Grails

If you’re looking on using different config files with tomcat, have a look at my other blog post: Using external configuration files in Grails with Tomcat.

At the moment I’m working my way into Grails. Having some experience with Java but originally coming from PHP, Grails really is a very nice Framework for the Groovy language. The guide to Grails is quite useful and covers a lot of things, but still some things are unsaid or not thoroughly handled in the guide.

One thing I came across is externalizing configuration. Grails deploys its application as a war file. The standard Grails-way would be to include all configurations in this war file. If some configuration entries change (let’s say the database password), you would need to recompile and deploy the war file again. Not good. Also you would write all your passwords in clear text into the configuration files, which would often be stored in some versioning system allowing access to a lot of people – and some should better not know your live database access data.

There is a short passage about externalization of configuration entries in the grails documentation. But it wasn’t exactly what I was looking for. So I searched a bit and found the following blog-post:
phatness.com/2010/03/how-to-externalize-your-grails-configuration/

When I read this, I though: wow. That’s it. Problem is: it might have worked for earlier versions of grails, but not for the version I’m using (grails 2.3.3).

What’s wrong or how to do it better? First it needs to read “grails.config.locations” with an “s” at the end! And as one comment states, you would want to switch the if-else-parts around. So command-line configuration overrides environment configurations.

Here’s my version. It all goes into the beginning of the Config.groovy-file.
def ENV_NAME = "APPNAME_CONFIG"
if(!grails.config.locations || !(grails.config.locations instanceof List)) {
grails.config.locations = []
}
if(System.getProperty(ENV_NAME)) {
println "Including configuration file specified on command line: " + System.getProperty(ENV_NAME);
grails.config.locations << "file:" + System.getProperty(ENV_NAME) } else if(System.getenv(ENV_NAME)) { println "Including configuration file specified in environment: " + System.getenv(ENV_NAME); grails.config.locations << "file:" + System.getenv(ENV_NAME) } else { println "No external configuration file defined." }

Now how do give an command line file (not on the live system, but on a local test)? You need to call grails with the -D-option like this:
grails -Dserver.port=8090 -DAPPNAME_CONFIG=/path/.../myconfig.groovy run-app

What I do is to let the server run on a different port (8090), because I already have a Tomcat-Server running on 8080. And I set the APPNAME_CONFIG-setting on the command line to the full path of my local groovy-config file. And finally I use run-app to start the app right away. Sadly this doesn't work if you do not run the app directly.

Now how does my myconfig.groovy look like? You'll get a stripped down version.
println 'Config loading'
dataSource {
pooled = true
url = "jdbc:xyz"
driverClassName = "driver"
dialect = org.hibernate.dialect.blabla
username = "123"
password = "456"
}
abc {
options {
option1 = "123"
option2 = "12345"
}
}
println 'Config loaded'

It basically looks like a normal groovy config file. What I really like is the ability to add println statements to the config file. By adding one at the beginning and end you can make sure, that your file is read completly. And to my knowledge there is no other way. First I had a mistake in my path the my config file. Grails did start without any message, that the file was not found - but didn't load it either.
Even if you make mistakes in the config file, grail will silently ignore it!

Now how do you access your own configuration-options in Grails?
You need to inject grailsApplication into your code - and through this object you could access your properties:

class SomeClass {
def grailsApplication

def someMethod() {
def myCustomOption1 = grailsApplication.config.abc.options.option1
}
}

This is my version of externalized configuration in Grails. It works quite well for me. Of course you have to restart the grails server, if your configuration file changed. But that's live.

Branching with Subversion and PHPStorm

PHPStorm is a PHP IDE from jetbrains. I do like it quite a lot. Compared to Eclipse and Netbeans it works better for me (I did not try current versions of Eclipse or Netbeans). PHPStorm is not free and I’m not using the most current version of it (5.0).

I recently came across some problem with merging a branch back into trunk. I do not know what exactly happened and how I managed to do this, but after the merge it had removed all changes that happened to the trunk between creation of the branch and the time of the merge from the result. I thought that everything was good and it looked so. Only that a week later I realized, that some feature was missing in the trunk. I’m not quite shure why this happened and what exactly I did wrong.

So today I tried to figure out, how to merge a branch back into the trunk with PHPStorm and Subversion. And after a few not so successful tries I did got it done.

Here’s what I did. (If you want to test this, do not use an important productive repository. Create one explicitly for testing.)

  • First I created a branch from the trunk. (Right-click on the Project in the project explorer -> Subversion -> Branch or Tag …) The trunk should already consist of some code.
  • Stay in the trunk and change a few files and commit them.
  • Switch to the branch (you need to update your project and choose a different url)
  • Change a few things and commit them

This is just for creating the basics. Now to merge the branch back into the trunk, do the following:

  • Make sure you are in the trunk and update to the head revision
  • It’s advised to commit all local changes into the trunk, though not necessary (but makes life a lot easier if for some reason you need to revert changes)
  • Right click on the Project in the Project-Explorer -> Subversion -> Integrate Directory…
  • Source 1 and Source 2 must contain the branch! Change it, if this is not the case
  • Change the revision for Source 1 to the creation revision for the branch. Source 2 revision could normally stay with HEAD.
  • Click the “Try merge, but make no changes” checkbox, if you want a dry run. This will show you, what’s going to happen and what conflicts might come up.
  • Do it again without the “Try merge” checkbox checked.

Now you should be done with the merge. You might have to resolve conflicts, if they occur.

If this is not successful, check your Source 1 and Source 2 again. If this does not contain the branch, you will get weird results. If for some reason PHPStorm tells you, that all files are up to date (and you are sure that changes in the branch exist) you might need to uncheck the “use ancestry” checkbox. PHPStorm seems to remember what you already merge and will normally not merge this again. I run into this problem after merging and than doing a revert. Tried to merge again but no changes came up. Not using the ancestry solved the problem for me.