unique drugs: 148,610
unique side effects: 12,880
total reports: 665,470
We investigated how data mining techniques can be applied to the side effects that patients report during drug therapy, both to determine which drugs cause which side effects and to identify drugs that interact with each other.
Data mining is useful in the medical field because the number of prescription drugs grows every year, and with it the risk of negative interactions between them. We set out to test that hypothesis by checking whether we could capture statistically significant combinations of drugs that cause adverse side effects, using support, confidence, and correlation filters. We optimized the Apriori algorithm so that it still checks all possible combinations of drugs but has a considerably smaller candidate pool to combine for each side effect.
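The per-side-effect pruning described above can be sketched roughly as follows. This is a minimal illustration, not our actual implementation: the function name `mine_rules`, the report layout (a pair of drug set and side-effect set), the threshold values, and the use of lift as the correlation filter are all assumptions made for the example. The idea is that a drug combination can only be frequent for a side effect if every drug in it is frequent for that side effect on its own, so combinations are built only from the drugs that survive the single-drug pass.

```python
from collections import Counter
from itertools import combinations


def mine_rules(reports, min_support=0.2, min_confidence=0.6, min_lift=1.2, max_size=2):
    """Return rules as (drugs, side_effect, support, confidence, lift).

    reports: list of (drugs, side_effects) pairs, each a set of strings.
    Thresholds are illustrative assumptions, not values from the project.
    """
    n = len(reports)
    min_count = min_support * n

    # Group reports by side effect so each effect is mined on its own subset.
    by_effect = {}
    for drugs, effects in reports:
        for effect in effects:
            by_effect.setdefault(effect, []).append(frozenset(drugs))

    rules = []
    for effect, effect_reports in by_effect.items():
        if len(effect_reports) < min_count:
            continue  # the effect itself is too rare to yield a frequent rule

        # Level 1: drugs that co-occur with this effect often enough.
        singles = Counter(d for drugs in effect_reports for d in drugs)
        pool = sorted(d for d, c in singles.items() if c >= min_count)

        # Level 2+: combine only drugs from the pruned pool (Apriori property).
        joint = {frozenset([d]): singles[d] for d in pool}
        for size in range(2, max_size + 1):
            for combo in combinations(pool, size):
                combo = frozenset(combo)
                count = sum(1 for drugs in effect_reports if combo <= drugs)
                if count >= min_count:
                    joint[combo] = count

        effect_rate = len(effect_reports) / n
        for combo, count in joint.items():
            # The antecedent is counted over all reports, not just this effect's.
            antecedent = sum(1 for drugs, _ in reports if combo <= set(drugs))
            confidence = count / antecedent
            lift = confidence / effect_rate  # assumed correlation measure
            if confidence >= min_confidence and lift >= min_lift:
                rules.append((tuple(sorted(combo)), effect, count / n, confidence, lift))
    return rules


# Made-up toy reports with placeholder drug names (not real data).
reports = [
    ({"drug_a", "drug_b"}, {"bleeding"}),
    ({"drug_a", "drug_b"}, {"bleeding"}),
    ({"drug_a", "drug_b"}, {"bleeding"}),
    ({"drug_a"}, {"nausea"}),
    ({"drug_b"}, {"nausea"}),
    ({"drug_a"}, {"headache"}),
    ({"drug_b"}, {"headache"}),
    ({"drug_c"}, {"nausea"}),
    ({"drug_a", "drug_c"}, {"nausea"}),
    ({"drug_b"}, set()),
]
rules = mine_rules(reports, min_support=0.15)
for drugs, effect, support, confidence, lift in rules:
    print(drugs, "->", effect, round(support, 2), round(confidence, 2), round(lift, 2))
```

In this toy data, neither `drug_a` nor `drug_b` alone passes the confidence filter for `bleeding`, but the pair does, which is the kind of interaction signal the filters are meant to isolate.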
Aside from capturing the dangerous drugs (most of which are now banned or are well known for their disastrous side effects), we also achieved a much shorter running time thanks to these optimizations.
The original research was published in BMC Bioinformatics. Its authors had a dataset of 162,744 individual reports, while ours had roughly four times as many (665,470). Their algorithm took around 4 hours to run to completion, while ours finished in under 4 minutes.
- Accuracy & Usefulness:
Most of the drugs that we identified as harmful were already well known to the public. However, had our system been installed directly into the database of medical reports, it could have captured such harmful outliers right away rather than years later, after thousands of patients had already been affected (e.g., Rofecoxib: “resulting in between 88,000 and 140,000 cases of serious heart disease”).
- The good news
Our hypothesis that many drugs would have negative interactions with each other was wrong. Across the 12,880 side effects in the dataset, we found only one pair of drugs with a significant interaction.
More details in our paper.