What is it about?

Traditional public health surveillance methods, such as those employed by the CDC (United States Centers for Disease Control and Prevention), rely on regular clinical reports, which are almost always manual and labor-intensive. Twitter, a popular micro-blogging service, offers the possibility of automated public health surveillance. Tweets, however, are limited to 140 characters and do not provide enough word occurrences for conventional classification methods to work reliably. Natural language is also complex, which makes health-related classification even more challenging. In this study, we use flu-related classification as a demonstration and propose a hybrid classification method that combines two kinds of features: manually defined features and features generated automatically by machine learning. Preprocessing based on Natural Language Processing (NLP) helps extract useful information and eliminate noisy features. Our simulations show improved accuracy.

Why is it important?

1. Efficiency: (A) We manually predefined feature words for the rule-based classifiers, based on common sense and expert opinion. Standard machine learning methods rely on word frequency or sufficient word co-occurrence, so they usually fail to weigh uncommon or specialized words properly. The rule-based classifiers identify flu-related tweets quickly and effectively. (B) We use a Naive Bayes model for the machine learning classifier. It assumes that all features are independent, which makes it computationally efficient.

2. Hybrid: Our hybrid architecture takes advantage of multiple approaches (NLP preprocessing, rule-based classifiers, and a machine learning classifier). It achieved better results than any single approach.
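To make the hybrid idea concrete, here is a minimal Python sketch of the three pieces described above: simple preprocessing, a rule-based check against predefined feature words, and a Naive Bayes classifier. It is an illustration under stated assumptions, not the method used in the paper. The word lists, the toy training tweets, the function and class names, and the way the two classifiers are combined (the rule decides first, Naive Bayes handles the rest) are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical feature words; the paper's actual list is not reproduced here.
RULE_WORDS = {"flu", "influenza", "tamiflu", "h1n1"}
# Tiny illustrative stopword list (an assumption, not the paper's preprocessing).
STOPWORDS = {"a", "an", "the", "i", "is", "to", "and", "my", "of", "in", "on", "it", "at"}


def preprocess(tweet):
    """Lowercase, drop URLs and @mentions, tokenize, and remove stopwords."""
    text = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    return [t for t in re.findall(r"[a-z0-9']+", text) if t not in STOPWORDS]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.classes for w in self.word_counts[c]}
        return self

    def predict(self, tokens):
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for c in self.classes:
            # Independence assumption: the class score is the log prior
            # plus a sum of per-word log likelihoods.
            score = math.log(self.doc_counts[c] / total_docs)
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            for t in tokens:
                if t in self.vocab:  # words never seen in training are skipped
                    score += math.log((self.word_counts[c][t] + 1) / denom)
            if score > best_score:
                best, best_score = c, score
        return best


def classify(tweet, model):
    """Assumed combination: a rule hit decides; otherwise use Naive Bayes."""
    tokens = preprocess(tweet)
    if RULE_WORDS & set(tokens):
        return "flu"
    return model.predict(tokens)


# Made-up training tweets, for illustration only.
train = [
    ("I have a fever and a terrible cough", "flu"),
    ("Stuck in bed with chills and body aches", "flu"),
    ("Great game last night, what a goal", "other"),
    ("Coffee and sunshine this morning", "other"),
]
model = NaiveBayes().fit([preprocess(t) for t, _ in train], [y for _, y in train])

print(classify("Got my Tamiflu today http://t.co/x", model))
print(classify("fever and chills all night", model))
print(classify("what a great goal", model))
```

In this sketch the rule-based check catches a rare but telling word such as "Tamiflu" even though it never appears in the training tweets, while Naive Bayes covers tweets that contain no predefined feature word.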

Read the Original

This page is a summary of: Hybrid classification for tweets related to infection with influenza, April 2015, Institute of Electrical & Electronics Engineers (IEEE),
DOI: 10.1109/secon.2015.7133015.
