
Gradient Boosted Trees Notes

Gradient Boosted Trees (GBT) is an ensemble method that incrementally learns new trees, each one fitted to the residual error of the present ensemble. This residual error resembles a gradient step of a linear model: for squared loss, the residuals are (up to a constant factor) the negative gradient of the loss with respect to the current predictions. A GBT estimates each gradient step with a new tree and adds that tree to the present ensemble, so that the whole model moves in the optimizing direction. This is not a very formal explanation, but it conveys my intuition.
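To make the intuition concrete, here is a minimal sketch of boosting on residuals with squared loss, in plain Python. It is only an illustration under my own assumptions: the trees are one-split stumps on a single feature, the toy data is a sine curve, and the names `fit_stump`, `boost` and `mse` are made up for this example.

```python
import math


def fit_stump(xs, targets):
    """Fit a one-split regression tree (a stump) to 1-D inputs by least squares."""
    best = None
    # Skip the largest value so the right side of the split is never empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, n_rounds=50, shrinkage=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    trees = []
    for _ in range(n_rounds):
        # For squared loss, the residual is the negative gradient w.r.t. the prediction.
        residuals = [y - p for y, p in zip(ys, preds)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        # Take a small step in the direction estimated by the new tree.
        preds = [p + shrinkage * tree(x) for x, p in zip(xs, preds)]
    return lambda x: base + shrinkage * sum(tree(x) for tree in trees)


def mse(predict, xs, ys):
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy data (an assumption for this sketch): a sine curve sampled on [0, 3.9].
xs = [i / 10 for i in range(40)]
ys = [math.sin(x) for x in xs]
mean_y = sum(ys) / len(ys)

baseline_mse = mse(lambda x: mean_y, xs, ys)  # error of predicting the mean
model = boost(xs, ys)
boosted_mse = mse(model, xs, ys)              # training error after boosting
print(baseline_mse, boosted_mse)
```

Each round only has to explain what the previous rounds got wrong, which is why even very weak trees add up to a reasonable fit.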

One formal way to think about GBT is that all possible tree constructions exist and the algorithm just selects the ones that are useful for the given data. Hence, compared to all possible trees, the number of trees constructed in the model is very small. This is similar to constructing this infinite number of trees and averaging them with weights estimated by LASSO.
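One way to write this analogy down, as a sketch in my own notation (the symbols T_m for the candidate trees, w_m for their weights, L for the loss and lambda for the penalty strength are assumptions, not something taken from a specific reference):

```latex
F(x) = \sum_{m} w_m \, T_m(x),
\qquad
\hat{w} = \arg\min_{w} \; \sum_{i=1}^{N} L\bigl(y_i, F(x_i)\bigr) + \lambda \sum_{m} \lvert w_m \rvert
```

The L1 penalty pushes most of the weights w_m to exactly zero, so only a small subset of all possible trees ends up in the model.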

GBT includes several hyperparameters, mostly for regularization.

  • Early stopping: How many rounds the GBT continues.
  • Shrinkage: Limit the update of each tree with a coefficient 0 < alpha < 1.
  • Data subsampling: Do not use the whole dataset for each tree; instead, sample instances. In general the sample ratio is n = 0.5, but it can be lower for larger datasets.
  • One side note: Subsampling without shrinkage performs poorly.

Then my initial setting is:

  • Run for a long time, with many rounds, while observing the loss on validation data.
  • Use a small shrinkage value, alpha = 0.001.
  • Sample 0.5 of the data.
  • Sample 0.9 of the features as well, or do the reverse.
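The setting above can be sketched in plain Python as follows. This is a toy illustration under my own assumptions, not a production implementation: the trees are one-split stumps, the data is synthetic, and `fit_stump`, `fit_gbt`, `make_data` and the `patience` parameter are names I made up. The defaults follow the list (shrinkage 0.001, half the rows, 0.9 of the features), but the demo call uses a larger shrinkage and fewer rounds so that it finishes quickly.

```python
import random


def fit_stump(rows, targets, feature_ids):
    """Least-squares one-split tree, searching only the given candidate features."""
    best = None
    for f in feature_ids:
        # Skip the largest value so the right side of the split is never empty.
        for threshold in sorted({r[f] for r in rows})[:-1]:
            left = [t for r, t in zip(rows, targets) if r[f] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[f] > threshold]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((t - lm) ** 2 for t in left)
                   + sum((t - rm) ** 2 for t in right))
            if best is None or sse < best[0]:
                best = (sse, f, threshold, lm, rm)
    if best is None:  # every sampled feature is constant on this subsample
        mean = sum(targets) / len(targets)
        return lambda r: mean
    _, f, threshold, lm, rm = best
    return lambda r: lm if r[f] <= threshold else rm


def fit_gbt(train, valid, max_rounds=5000, shrinkage=0.001,
            row_frac=0.5, feat_frac=0.9, patience=100, seed=0):
    """Boost stumps with shrinkage, row/feature subsampling and early stopping."""
    rng = random.Random(seed)
    (x_tr, y_tr), (x_va, y_va) = train, valid
    n_feat = len(x_tr[0])
    base = sum(y_tr) / len(y_tr)
    p_tr, p_va = [base] * len(x_tr), [base] * len(x_va)
    trees, history = [], []
    best_loss, best_round = float("inf"), 0
    for rnd in range(max_rounds):
        # Subsample rows and features independently for every tree.
        idx = rng.sample(range(len(x_tr)), max(2, int(row_frac * len(x_tr))))
        feats = rng.sample(range(n_feat), max(1, int(feat_frac * n_feat)))
        tree = fit_stump([x_tr[i] for i in idx],
                         [y_tr[i] - p_tr[i] for i in idx], feats)
        trees.append(tree)
        p_tr = [p + shrinkage * tree(x) for x, p in zip(x_tr, p_tr)]
        p_va = [p + shrinkage * tree(x) for x, p in zip(x_va, p_va)]
        # Observe the validation loss after every round.
        loss = sum((y - p) ** 2 for y, p in zip(y_va, p_va)) / len(y_va)
        history.append(loss)
        if loss < best_loss:
            best_loss, best_round = loss, rnd + 1
        elif rnd + 1 - best_round >= patience:
            break  # no improvement for `patience` rounds
    # Keep only the trees up to the best validation round.
    return trees[:best_round], base, history


# Synthetic data (an assumption for this sketch): 5 features, 2 of them informative.
data_rng = random.Random(1)


def make_data(n):
    xs = [[data_rng.random() for _ in range(5)] for _ in range(n)]
    ys = [(2.0 if x[0] > 0.5 else 0.0) + x[1] for x in xs]
    return xs, ys


train, valid = make_data(60), make_data(40)
# Larger shrinkage and fewer rounds than my real setting, only to keep the demo fast.
trees, base, history = fit_gbt(train, valid, max_rounds=200,
                               shrinkage=0.1, patience=20)
baseline = sum((y - base) ** 2 for y in valid[1]) / len(valid[1])
print(len(trees), baseline, min(history))
```

With a shrinkage as small as 0.001 the same loop needs thousands of rounds, which is why the round limit is set high and the validation loss decides when to stop.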

Kaggle Plankton Challenge Winner's Approach

I recently took part in the Plankton Classification Challenge on Kaggle. Even though I used a simpler (stupidly simple compared to the winner's) Deep NN model for my submissions, I ended up in 192nd position among 1046 participants. Still, it was a very good testing ground for me to try out ideas that are new to the Deep Learning community, along with a couple of novel things which I plan to explain later on my blog.

In this post, I share my notes about the winner's approach (which is explained here extensively).