Empirical likelihood confidence intervals for differences between two datasets with missing data

Qin, Y; Zhang, S

Publisher: Elsevier
Publication Type: Journal Article
Citation: Pattern Recognition Letters, 2008, 29 (6), pp. 803 - 812
Issue Date: 2008-01

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Qin, Y	en_US
dc.contributor.author	Zhang, S	en_US
dc.date.issued	2008-01	en_US
dc.identifier.citation	Pattern Recognition Letters, 2008, 29 (6), pp. 803 - 812	en_US
dc.identifier.issn	0167-8655	en_US
dc.identifier.uri	http://hdl.handle.net/10453/9007
dc.publisher	Elsevier	en_US
dc.relation	http://purl.org/au-research/grants/arc/DP0559536
dc.relation	http://purl.org/au-research/grants/arc/DP0667060
dc.relation.ispartof	Pattern Recognition Letters	en_US
dc.relation.isbasedon	10.1016/j.patrec.2007.12.010	en_US
dc.subject.classification	Artificial Intelligence & Image Processing	en_US
dc.title	Empirical likelihood confidence intervals for differences between two datasets with missing data	en_US
dc.type	Journal Article
utslib.citation.volume	6	en_US
utslib.citation.volume	29	en_US
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
utslib.for	0906 Electrical and Electronic Engineering	en_US
utslib.for	1702 Cognitive Sciences	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
utslib.copyright.status	closed_access
pubs.consider-herdc	false	en_US
pubs.issue	6	en_US
pubs.volume	29	en_US

Abstract:

Detecting differences between populations (or datasets) is an important research topic in machine learning, and it is also a common practical means of evaluation, for example assessing a new medical product by comparing it with an old one. Previous research has focused on change detection. In this paper, we measure the uncertainty of structural differences between populations, such as differences in means and in distribution functions, using a confidence interval (CI) obtained via an empirical likelihood approach. We present a statistically sound method for estimating CIs for differences between non-parametric populations with missing values, which are imputed using the simple random hot deck imputation method. We illustrate the power of CI estimation as a new machine learning technique for tasks such as distinguishing spam from non-spam emails in the Spambase dataset downloaded from the UCI repository.
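
The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of `None` to mark a missing value, and the bisection solver are all assumptions made for illustration. The first function performs simple random hot deck imputation (each missing value is replaced by a randomly drawn observed value from the same sample). The second computes the standard complete-data, one-sample empirical likelihood statistic for a mean, which is only a building block; the two-sample statistic for a difference and its calibration under imputation, as developed in the paper, are not reproduced here.

```python
import math
import random


def random_hot_deck_impute(values, rng):
    """Fill each missing entry (None) with a donor drawn at random,
    with replacement, from the observed entries of the same sample."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values available as donors")
    return [rng.choice(observed) if v is None else v for v in values]


def el_log_ratio_mean(xs, mu, iters=200):
    """Return -2 log R(mu), the one-sample empirical likelihood ratio
    statistic for the mean, assuming complete (non-imputed) data.
    Illustrative only: the paper's statistic for differences is different."""
    d = [x - mu for x in xs]
    # mu must lie strictly inside the range of the data.
    if min(d) >= 0 or max(d) <= 0:
        return math.inf
    # The Lagrange multiplier must keep every weight 1 + lam * d_i positive.
    lo = -1.0 / max(d)
    hi = -1.0 / min(d)
    width = hi - lo
    lo += 1e-10 * width
    hi -= 1e-10 * width
    # g(lam) = sum d_i / (1 + lam * d_i) is strictly decreasing; bisect for its root.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        g = sum(di / (1.0 + mid * di) for di in d)
        if g > 0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log1p(lam * di) for di in d)


# Hypothetical toy samples; None marks a missing value.
rng = random.Random(0)
sample_a = random_hot_deck_impute([2.1, None, 3.4, 2.9, None, 3.0], rng)
sample_b = random_hot_deck_impute([1.2, 1.9, None, 1.5, 2.2], rng)
mean_diff = sum(sample_a) / len(sample_a) - sum(sample_b) / len(sample_b)
print("point estimate of the mean difference:", mean_diff)
```

Because imputed values are copies of observed ones, treating an imputed sample as if it were fully observed tends to understate uncertainty; this is why a dedicated calibration, such as the one the paper proposes, is needed before the statistic is turned into a confidence interval.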

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/9007