Creating a cross tab in MySQL
Data stored in a database is often also useful for statistical purposes. If you own a web-shop you want to be able to create a report about turnover. You can get statistical information by using GROUP BY, eg.
SELECT DATE_FORMAT(invoice.date, '%M') AS `month`, COUNT(*) AS `invoice_count`, SUM(`invoice`.`amount`) AS `turnover` FROM `invoice` WHERE `date` BETWEEN '2008-01-01' AND '2008-12-31' GROUP BY MONTH(`invoice`.`date`)
month invoice_count turnover January 84 9532.26 February 141 20857.61 March 91 10922.71 April 112 15044.48 May 101 9676.60 June 137 12860.88 July 281 34291.20 August 191 26377.66 September 103 16324.78 October 99 12873.23
If you are selling a wide variety of products, you might like to see the turnover for each product category. You could do this with a simple GROUP BY as:
SELECT DATE_FORMAT(`invoice`.`date`, '%M') AS `month`, `category`.`description` AS `category`, COUNT(*) AS `product_count`, SUM(`invoice_product`.`amount`) AS `turnover` FROM `invoice` INNER JOIN `invoice_product` ON `invoice`.`id` = `invoice_product`.`invoice_id` LEFT JOIN `product` ON `invoice_product`.`product_id` = `product`.`id` LEFT JOIN `category` ON `product`.`category_id` = `category`.`id` WHERE `date` BETWEEN '2008-01-01' AND '2008-12-31' GROUP BY MONTH(`invoice`.`date`), `category`.`id`
month category product_count turnover January Hardware 62 4821.31 January Software 51 4419.41 January Cables 12 291.54 February Hardware 71 8408.93 February Software 101 11726.36 February Cables 17 312.32 February Other 2 410.00 March Hardware 21 2371.58 March Software 81 8238.81 March Cables 13 312.32 ...
This would give you each category in a different row, ordered by month. Though this contains all the information the format is far from nice. Instead you would like to have 1 row per month with each category as a column as the information about the invoices as well.
SELECT DATE_FORMAT(`invoice`.`date`, '%M') AS `month`, COUNT(DISTINCT `invoice`.`id`) AS `product_count`, COUNT(*) AS `invoice_count`, SUM(`invoice_product`.`amount`) AS `turnover`, SUM(`product`.`category_id`=1) AS `hardware_count`, SUM(IF(`product`.`category_id`=1, `invoice_product`.`amount`, 0)) AS `hardware_turnover`, SUM(`product`.`category_id`=2) AS `software_count`, SUM(IF(`product`.`category_id`=2, `invoice_product`.`amount`, 0)) AS `software_turnover`, SUM(`product`.`category_id`=3) AS `cables_count`, SUM(IF(`product`.`category_id`=3, `invoice_product`.`amount`, 0)) AS `cables_turnover`, SUM(`product`.`category_id`=4) AS `other_count`, SUM(IF(`product`.`category_id`=4, `invoice_product`.`amount`, 0)) AS `other_turnover` FROM `invoice` INNER JOIN `invoice_product` ON `invoice`.`id` = `invoice_product`.`invoice_id` LEFT JOIN `product` ON `invoice_product`.`product_id` = `product`.`id` LEFT JOIN `category` ON `product`.`category_id` = `category`.`id` WHERE `date` BETWEEN '2008-01-01' AND '2008-12-31' GROUP BY MONTH(`invoice`.`date`), `category`.`id`
month invoice_count turnover hardware_count hardware_turnover software_count software_turnover cables_count cables_turnover other_count other_turnover January 84 9532.26 62 4821.31 51 4419.41 12 291.54 0 0 February 141 20857.61 71 8408.93 101 11726.36 17 312.32 2 410.00 March 91 10922.71 21 2371.58 81 8238.81 13 312.32 0 0 ...
The big downside of this method is that you need to modify the query if a category is added. This can be solved though by dynamically creating the query in a PHP script.
If you want to do advanced statistical you should have a look at OLAP cubes. Pentaho is an open-source reporting app which supports MySQL. http://mondrian.pentaho.org/
All data is fictional. The SQL queries are untested.
31 Oct 2008 Arnold Daniels





There’s actually mysql-proxy code that makes crosstabs easy:
http://forge.mysql.com/tools/tool.php?id=147
Actually, this old article details how to use the database itself to build your new query.
http://dev.mysql.com/tech-resources/articles/wizard/index.html
It actually a long narrative, but you can skim it pretty quickly.
thanks for the clear explaination