URL
About this prototype
This is a prototype showing how classifiers can be used to detect spam blogs (splogs). Currently, two classifiers cooperate
to identify spam. If the idea shows potential, we will use clusters of dynamic classifiers for spam identification.
URLs are also checked directly; however, more URLs are needed for the training data.
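As a rough illustration of the idea, the sketch below checks a URL against a known-spam list first and then combines the scores of two classifiers. It is not the prototype's actual code: the host list, the keyword weights, both toy scoring functions, the averaging rule, and the 0.5 threshold are all assumptions made for this example.

```python
from urllib.parse import urlparse

# Hypothetical URL corpus of known spam hosts (illustrative only).
KNOWN_SPAM_HOSTS = {"cheap-pills.example", "win-big.example"}

# Hypothetical keyword weights standing in for a trained text classifier.
SPAM_TERMS = {"viagra": 0.9, "casino": 0.8, "cheap": 0.4}


def url_is_known_spam(url):
    """Direct URL check: True if the host is in the known-spam corpus."""
    host = (urlparse(url).hostname or "").lower()
    return host in KNOWN_SPAM_HOSTS


def text_score(text):
    """Toy classifier 1: highest weight of any spam term found in the text."""
    words = text.lower().split()
    return max((SPAM_TERMS.get(w, 0.0) for w in words), default=0.0)


def link_score(text):
    """Toy classifier 2: fraction of whitespace-separated tokens that are links."""
    words = text.split()
    if not words:
        return 0.0
    links = sum(1 for w in words if w.startswith(("http://", "https://")))
    return links / len(words)


def is_splog(url, text, threshold=0.5):
    """Flag a page as spam on a direct URL hit, else on the averaged scores."""
    if url_is_known_spam(url):
        return True
    # Assumed cooperation rule: a plain average of the two classifier scores.
    combined = (text_score(text) + link_score(text)) / 2
    return combined >= threshold
```

With these made-up values, a page on a listed host is flagged regardless of its text, while an unlisted page is flagged only when the averaged score reaches the threshold.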
Known Issues
- Training data is quite old.
- More training data required.
- Will mistake most non-English blogs for spam.
- Doesn't work as well on forums.
- Can have long response times.
- URL corpus is too small.
API