Incremental information gain analysis of input attribute impact on RBF-kernel SVM spam detection

He, H; Tiwari, A; Mehnen, J; Watson, T; Maple, C; Jin, Y; Gabrys, B

Publisher: IEEE
Publication Type: Conference Proceeding
Citation: 2016 IEEE Congress on Evolutionary Computation, CEC 2016, 2016, pp. 1022-1029
Issue Date: 2016-11-14

Closed Access

	Filename	Description	Size	Format
	Incremental_information_gain_analysis_of_input_attribute_impact_on_RBF-kernel_SVM_spam_detection.pdf	Published version	289.21 kB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	He, H
dc.contributor.author	Tiwari, A
dc.contributor.author	Mehnen, J
dc.contributor.author	Watson, T
dc.contributor.author	Maple, C
dc.contributor.author	Jin, Y
dc.contributor.author	Gabrys, B https://orcid.org/0000-0002-0790-2846
dc.date	2016-07-24
dc.date.accessioned	2023-02-23T06:59:09Z
dc.date.available	2023-02-23T06:59:09Z
dc.date.issued	2016-11-14
dc.identifier.citation	2016 IEEE Congress on Evolutionary Computation, CEC 2016, 2016, pp. 1022-1029
dc.identifier.isbn	9781509006229
dc.identifier.uri	http://hdl.handle.net/10453/166400
dc.description.abstract	The massive increase in spam poses a serious threat to email and SMS, which have become important means of communication. Spam not only annoys users but also poses a security threat. Machine learning techniques have been widely used for spam detection. Email spam can be detected from senders' behaviour, the contents of an email, its subject and source address, etc., while SMS spam detection is usually based on the tokens or features of messages because their content is short. However, a comprehensive analysis of email/SMS content may provide cues that help users become aware of email/SMS spam. We cannot completely depend on automatic tools to identify all spam. In this paper, we propose an analysis approach based on information entropy and incremental learning to see how various features affect the performance of an RBF-based SVM spam detector, so as to increase our awareness of spam by sensing its features. The experiments were carried out on the spambase and SMSSpamCollection datasets in the UCI Machine Learning Repository. The results show that some features have significant impacts on spam detection, of which users should be aware, and that there exists a feature space that achieves Pareto efficiency in True Positive Rate and True Negative Rate.
dc.language	en
dc.publisher	IEEE
dc.relation.ispartof	2016 IEEE Congress on Evolutionary Computation, CEC 2016
dc.relation.ispartof	IEEE Congress on Evolutionary Computation (CEC) held as part of IEEE World Congress on Computational Intelligence (IEEE WCCI)
dc.relation.ispartofseries	IEEE Congress on Evolutionary Computation
dc.relation.isbasedon	10.1109/CEC.2016.7743901
dc.rights	info:eu-repo/semantics/closedAccess
dc.title	Incremental information gain analysis of input attribute impact on RBF-kernel SVM spam detection
dc.type	Conference Proceeding
utslib.location.activity	Vancouver, CANADA
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - CHT - Health Technologies
pubs.organisational-group	/University of Technology Sydney/Strength - AAI - Advanced Analytics Institute Research Centre
pubs.organisational-group	/University of Technology Sydney/Centre for Health Technologies (CHT)
utslib.copyright.status	closed_access	*
dc.date.updated	2023-02-23T06:59:08Z
pubs.finish-date	2016-07-29
pubs.publication-status	Published
pubs.start-date	2016-07-24

Abstract:

The massive increase in spam poses a serious threat to email and SMS, which have become important means of communication. Spam not only annoys users but also poses a security threat. Machine learning techniques have been widely used for spam detection. Email spam can be detected from senders' behaviour, the contents of an email, its subject and source address, etc., while SMS spam detection is usually based on the tokens or features of messages because their content is short. However, a comprehensive analysis of email/SMS content may provide cues that help users become aware of email/SMS spam. We cannot completely depend on automatic tools to identify all spam. In this paper, we propose an analysis approach based on information entropy and incremental learning to see how various features affect the performance of an RBF-based SVM spam detector, so as to increase our awareness of spam by sensing its features. The experiments were carried out on the spambase and SMSSpamCollection datasets in the UCI Machine Learning Repository. The results show that some features have significant impacts on spam detection, of which users should be aware, and that there exists a feature space that achieves Pareto efficiency in True Positive Rate and True Negative Rate.
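
The abstract names two ingredients, entropy-based feature analysis and Pareto efficiency in True Positive Rate and True Negative Rate, without giving the procedure. The following is a minimal sketch of how those two ideas are commonly computed, not the authors' implementation. The toy binary features (`has_free`, `has_url`, `has_hello`), the toy labels, the (TPR, TNR) pairs, and all function names are hypothetical and chosen only for illustration. Growing the feature set in rank order is an assumption about what "incremental" means here, and training the RBF-kernel SVM on each subset is deliberately left out.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their expected entropy after
    splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels, names):
    """Return (names sorted by information gain, highest first; gain per name)."""
    gains = {name: information_gain([row[i] for row in rows], labels)
             for i, name in enumerate(names)}
    return sorted(names, key=lambda n: gains[n], reverse=True), gains


def pareto_front(points):
    """Keep the (tpr, tnr) points that no other point dominates."""
    return [p for p in points
            if not any(q != p and q[0] >= p[0] and q[1] >= p[1]
                       for q in points)]


# Hypothetical toy data: one row per message, 1 = spam, 0 = ham.
names = ["has_free", "has_url", "has_hello"]
rows = [[1, 1, 0], [1, 0, 1], [1, 1, 1], [0, 1, 0],
        [0, 0, 1], [0, 0, 0], [0, 1, 1], [0, 0, 0]]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

ranked, gains = rank_features(rows, labels, names)

# Assumed incremental scheme: grow the feature set one ranked feature at a
# time. A classifier would be trained and scored on each subset (not shown).
subsets = [ranked[:k] for k in range(1, len(ranked) + 1)]

# Made-up (TPR, TNR) pairs standing in for per-subset scores; these are
# not results from the paper.
scores = [(0.90, 0.95), (0.95, 0.90), (0.85, 0.85), (0.97, 0.80)]
front = pareto_front(scores)
```

A feature subset is on the Pareto front when no other subset is at least as good on both rates and better on one, which is one way to read the "Pareto efficiency" claim in the abstract.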

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/166400