A Fuzzy Word Similarity Measure for Selecting Top-k Similar Words in Query Expansion

Liu, Q; Huang, H; Xuan, J; Zhang, G; Gao, Y; Lu, J

Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Publication Type: Journal Article
Citation: IEEE Transactions on Fuzzy Systems, 2021, 29, (8), pp. 2132-2144
Issue Date: 2021-01-01

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Liu, Q
dc.contributor.author	Huang, H
dc.contributor.author	Xuan, J https://orcid.org/0000-0002-8367-6908
dc.contributor.author	Zhang, G https://orcid.org/0000-0003-3960-0583
dc.contributor.author	Gao, Y
dc.contributor.author	Lu, J https://orcid.org/0000-0003-0690-4732
dc.date.accessioned	2022-02-22T20:29:10Z
dc.date.available	2022-02-22T20:29:10Z
dc.date.issued	2021-01-01
dc.identifier.citation	IEEE Transactions on Fuzzy Systems, 2021, 29, (8), pp. 2132-2144
dc.identifier.issn	1063-6706
dc.identifier.issn	1941-0034
dc.identifier.uri	http://hdl.handle.net/10453/154782
dc.description.abstract	Top-k words selection is a technique used to detect and return the k most similar words to a given word from a candidate set. This is a crucial and widely used tool in various tasks. The key issue in top-k words selection is how to measure the similarity between words. One popular and effective solution is to use a word embedding-based similarity measure, which represents words as low-dimensional vectors and measures the similarities between words according to the similarity of the vectors, using a metric. However, most word embedding methods only consider the local proximity properties of two words in a corpus. To mitigate this issue, we propose in this article to use association rules for measuring word similarity at a global level, and a fuzzy similarity measure for top-k words selection that jointly encodes the local and the global similarities. Experiments on a real-world query task with three benchmark datasets, i.e., TREC-disk 4&5, WT10G, and RCV1, demonstrate the efficiency of the proposed method compared to several state-of-the-art baselines.
dc.language	English
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)
dc.relation	http://purl.org/au-research/grants/arc/DP170101632
dc.relation.ispartof	IEEE Transactions on Fuzzy Systems
dc.relation.isbasedon	10.1109/tfuzz.2020.2993702
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	0102 Applied Mathematics, 0801 Artificial Intelligence and Image Processing, 0906 Electrical and Electronic Engineering
dc.subject.classification	Artificial Intelligence & Image Processing
dc.title	A Fuzzy Word Similarity Measure for Selecting Top-k Similar Words in Query Expansion
dc.type	Journal Article
utslib.citation.volume	29
utslib.for	0102 Applied Mathematics
utslib.for	0801 Artificial Intelligence and Image Processing
utslib.for	0906 Electrical and Electronic Engineering
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - AAII - Australian Artificial Intelligence Institute
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
utslib.copyright.status	closed_access	*
pubs.consider-herdc	false
dc.date.updated	2022-02-22T20:29:08Z
pubs.issue	8
pubs.publication-status	Published
pubs.volume	29
utslib.citation.issue	8

Abstract:

Top-k words selection is a technique used to detect and return the k most similar words to a given word from a candidate set. This is a crucial and widely used tool in various tasks. The key issue in top-k words selection is how to measure the similarity between words. One popular and effective solution is to use a word embedding-based similarity measure, which represents words as low-dimensional vectors and measures the similarities between words according to the similarity of the vectors, using a metric. However, most word embedding methods only consider the local proximity properties of two words in a corpus. To mitigate this issue, we propose in this article to use association rules for measuring word similarity at a global level, and a fuzzy similarity measure for top-k words selection that jointly encodes the local and the global similarities. Experiments on a real-world query task with three benchmark datasets, i.e., TREC-disk 4&5, WT10G, and RCV1, demonstrate the efficiency of the proposed method compared to several state-of-the-art baselines.
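
The general idea in the abstract can be illustrated with a small sketch. This is not the authors' method: the abstract does not give the formula of the fuzzy measure, so the code below makes several assumptions. Local similarity is taken to be cosine similarity between embedding vectors (clipped to [0, 1]), global similarity is taken to be the confidence of the association rule word -> candidate over a document collection, and the two are combined with the probabilistic sum a + b - a*b, a standard fuzzy union operator used here only as a stand-in. The function names (cosine, rule_confidence, top_k) and all data are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rule_confidence(word: str, cand: str, docs: List[Set[str]]) -> float:
    """Confidence of the rule word -> cand: the fraction of documents
    containing `word` that also contain `cand` (assumed global signal)."""
    with_word = [d for d in docs if word in d]
    if not with_word:
        return 0.0
    return sum(1 for d in with_word if cand in d) / len(with_word)


def top_k(
    word: str,
    candidates: Sequence[str],
    vectors: Dict[str, Sequence[float]],
    docs: List[Set[str]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k candidates with the highest combined score."""
    scored = []
    for cand in candidates:
        if cand == word:
            continue
        # Local similarity: cosine of the embeddings, clipped to [0, 1].
        local = max(0.0, cosine(vectors[word], vectors[cand]))
        # Global similarity: association-rule confidence, already in [0, 1].
        global_sim = rule_confidence(word, cand, docs)
        # Assumed combination: probabilistic sum (a fuzzy union operator).
        combined = local + global_sim - local * global_sim
        scored.append((cand, combined))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Calling top_k("car", candidates, vectors, docs, k=2) with a dictionary of word vectors and a list of documents represented as word sets returns the two highest-scoring candidates with their scores. With this combination operator, a candidate that co-occurs with the query word in every document receives a score of about 1 regardless of its embedding similarity, which shows how a global signal can promote words that a purely local measure would rank lower.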

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/154782