Browsing by Author "Thabtah, Fadi"

Item (Open Access): A Classification Rules Mining Method based on Dynamic Rules' Frequency (IEEE Computer Society, 2015)
Qabajeh, Issa; Chiclana, Francisco; Thabtah, Fadi
Rule based classification or rule induction (RI) in data mining is an approach that normally generates classifiers containing simple yet effective rules. Most RI algorithms suffer from a few drawbacks, mainly related to rule pruning and rules sharing training data instances. In response to these two issues, a new dynamic rule induction (DRI) method is proposed that utilises two thresholds to minimise the item search space. Whenever a rule is generated, the DRI algorithm ensures that all candidate items' frequencies are updated to reflect the deletion of the rule's training data instances. Therefore, the remaining candidate items waiting to be added to other rules have dynamic rather than static frequencies. This enables DRI to generate not only rules with 100% accuracy but rules with high accuracy as well. Experimental tests using a number of UCI data sets have been conducted using a number of RI algorithms. The results clearly show competitive performance with regard to classification accuracy and classifier size of DRI when compared to other RI algorithms.

Item (Open Access): Constrained Dynamic Rule Induction Learning (Elsevier, 2016-06-24)
Thabtah, Fadi; Qabajeh, Issa; Chiclana, Francisco
One of the known classification approaches in data mining is rule induction (RI). RI algorithms such as PRISM usually produce If-Then classifiers, which have a comparable predictive performance to other traditional classification approaches such as decision trees and associative classification. Hence, these classifiers are well suited to decision making by users and can be utilised as decision making tools. Nevertheless, RI methods, including PRISM and its successors, suffer from a number of drawbacks, primarily the large number of rules derived.
This can be a burden, especially when the input data is high dimensional. Therefore, pruning unnecessary rules becomes essential for the success of this type of classifier. This article proposes a new RI algorithm that reduces the search space for candidate rules by pruning any irrelevant items early in the process of building the classifier. Whenever a rule is generated, our algorithm updates the candidate items' frequency to reflect the discarded data examples associated with the rules derived. This makes item frequency dynamic rather than static and ensures that irrelevant rules are deleted in preliminary stages when they do not hold enough data representation. The major benefit will be a concise set of decision making rules that are easy to understand and controlled by the decision maker. The proposed algorithm has been implemented in the WEKA (Waikato Environment for Knowledge Analysis) environment and hence it can now be utilised by different types of users such as managers, researchers, students and others. Experimental results using real data from the security domain as well as sixteen classification datasets from the University of California Irvine (UCI) repository reveal that the proposed algorithm is competitive with regard to classification accuracy when compared to known RI algorithms.
Moreover, the classifiers produced by our algorithm are smaller in size, which increases their possible use in practical applications.

Item (Metadata only): An Experimental Study for Assessing Email Classification Attributes Using Feature Selection Methods (IEEE, 2014)
Qabajeh, Issa; Thabtah, Fadi

Item (Metadata only): An experimental study of three different rule ranking formulas in associative classification (Infonomics Society, 2012)
Abdelhamid, Neda; Ayesh, Aladdin, 1972-; Thabtah, Fadi

Item (Metadata only): MAC: A Multiclass Associative Classification Algorithm (2012-06)
Abdelhamid, Neda; Ayesh, Aladdin, 1972-; Thabtah, Fadi; Ahmadi, Samad; Hadi, Wael
Associative classification (AC) is a data mining approach that uses association rule discovery methods to build classification systems (classifiers). Several research studies reveal that AC normally generates more accurate classifiers than classic classification data mining approaches such as rule induction, probabilistic and decision trees. This paper proposes a new multiclass AC algorithm called MAC. The proposed algorithm employs a novel method for building the classifier that normally reduces the resulting classifier size in order to enable end-users to better understand and maintain it. Experiments on 19 different data sets from the UCI data repository, using different common AC and traditional learning approaches, have been conducted with reference to classification accuracy and the number of rules derived. The results show that the proposed algorithm is able to derive more predictive classifiers than rule induction (RIPPER) and decision tree (C4.5) algorithms and is very competitive with a known AC algorithm named MCAR.
Furthermore, MAC is also able to produce fewer rules than MCAR in normal circumstances (standard support and confidence thresholds) and in severe circumstances (low support and confidence thresholds) for most of the data sets considered in the experiments.

Item (Metadata only): Phishing detection based Associative Classification data mining (Elsevier, 2014-03-27)
Abdelhamid, Neda; Ayesh, Aladdin, 1972-; Thabtah, Fadi
Website phishing is considered one of the crucial security challenges for the online community due to the massive numbers of online transactions performed on a daily basis. Website phishing can be described as mimicking a trusted website to obtain sensitive information from online users such as usernames and passwords. Black lists, white lists and the utilisation of search methods are examples of solutions to minimise the risk of this problem. One intelligent approach based on data mining called Associative Classification (AC) seems a potential solution that may effectively detect phishing websites with high accuracy. According to experimental studies, AC often extracts classifiers containing simple "If-Then" rules with a high degree of predictive accuracy. In this paper, we investigate the problem of website phishing using a developed AC method called Multi-label Classifier based Associative Classification (MCAC) to assess its applicability to the phishing problem. We also want to identify features that distinguish phishing websites from legitimate ones. In addition, we survey intelligent approaches used to handle the phishing problem. Experimental results using real data collected from different sources show that AC, particularly MCAC, detects phishing websites with higher accuracy than other intelligent algorithms.
Further, MCAC generates new hidden knowledge (rules) that other algorithms are unable to find, and this has improved its classifiers' predictive performance.

Item (Metadata only): Prediction phase in associative classification mining. (World Scientific Publishing, 2011)
Thabtah, Fadi; Hadi, Wael; Abdelhamid, Neda; Issa, Ayman

Item (Open Access): A recent review of conventional vs. automated cybersecurity anti-phishing techniques (Elsevier, 2018-06-28)
Qabajeh, Issa; Thabtah, Fadi; Chiclana, Francisco
In the era of electronic and mobile commerce, massive numbers of financial transactions are conducted online on a daily basis, which has created potential opportunities for fraud. A common fraudulent activity that involves creating a replica of a trusted website to deceive users and illegally obtain their credentials is website phishing. Website phishing is a serious online fraud, costing banks, online users, governments, and other organisations severe financial damages. One conventional approach to combat phishing is to raise awareness and educate novice users on the different tactics utilised by phishers by conducting periodic training or workshops. However, this approach has been criticised for not being cost effective, as phishing tactics are constantly changing and it may incur high operational costs. Another anti-phishing approach is to legislate or amend existing cyber security laws that prosecute online fraudsters without minimising its severity. A more promising anti-phishing approach is to prevent phishing attacks using intelligent machine learning (ML) technology. Using this technology, a classification system is integrated in the browser, where it detects phishing activities and communicates them to the end user. This paper reviews and critically analyses legal, training, educational and intelligent anti-phishing approaches.
More importantly, ways to combat phishing by intelligent and conventional approaches are highlighted, along with these approaches' differences, similarities, and positive and negative aspects from the user and performance perspectives. Different stakeholders such as computer security experts, researchers in web security and business owners are likely to benefit from this review on website phishing.