Wednesday, February 4, 2009

Training a text classifier

A text classification system has to be trained before it is any use. Typically you have a corpus of good data that has been accurately pre-classified, and this is what you feed the system while it learns the classification.
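To make that concrete, here is a minimal sketch of what training on a pre-classified corpus can look like. It assumes a simple naive Bayes model with add-one smoothing, which is only one of many possible approaches. The class name, the tiny corpus, and the "spam"/"ham" labels are all invented for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesTextClassifier:
    """Toy multinomial naive Bayes classifier (illustrative only)."""

    def __init__(self):
        self.label_counts = Counter()            # documents seen per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocabulary = set()                  # every word seen in training

    def train(self, text, label):
        """Learn from one pre-classified document."""
        words = text.lower().split()
        self.label_counts[label] += 1
        self.word_counts[label].update(words)
        self.vocabulary.update(words)

    def classify(self, text):
        """Return the most likely label, or None if nothing has been learned."""
        if not self.label_counts:
            return None
        words = text.lower().split()
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Log prior plus log likelihood with add-one (Laplace) smoothing.
            score = math.log(doc_count / total_docs)
            for word in words:
                score += math.log(
                    (counts[word] + 1) / (total_words + len(self.vocabulary))
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical pre-classified corpus, invented for this example.
corpus = [
    ("cheap pills buy now", "spam"),
    ("win money now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]

classifier = NaiveBayesTextClassifier()
for text, label in corpus:
    classifier.train(text, label)
```

A freshly constructed classifier has no counts at all, so `classify` returns `None` until `train` has been called at least once. After the four examples above it can already separate messages that reuse the training vocabulary, although a real system would need a far larger corpus and proper tokenization.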

I came up with what I thought was a good analogy for an untrained text classifier: a genius amnesiac, i.e. someone who initially knows nothing but learns lightning fast.
