Analysing the content of web 2.0 documents by using a hybrid approach