Less Is More: Rejecting Unreliable Reviews for Product Question Answering

Zhang, Shiwei; Zhang, Xiuzhen; Lau, Jey Han; Chan, Jeffrey; Paris, Cecile

doi:10.1007/978-3-030-67664-3_34

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12459)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

Abstract

Promptly and accurately answering questions on products is important for e-commerce applications. Manually answering product questions (e.g. on community question answering platforms) results in slow response and does not scale. Recent studies show that product reviews are a good source for real-time, automatic product question answering (PQA). In the literature, PQA is formulated as a retrieval problem with the goal to search for the most relevant reviews to answer a given product question. In this paper, we focus on the issue of answerability and answer reliability for PQA using reviews. Our investigation is based on the intuition that many questions may not be answerable with a finite set of reviews. When a question is not answerable, a system should return nil answers rather than providing a list of irrelevant reviews, which can have significant negative impact on user experience. Moreover, for answerable questions, only the most relevant reviews that answer the question should be included in the result. We propose a conformal prediction based framework to improve the reliability of PQA systems, where we reject unreliable answers so that the returned results are more concise and accurate at answering the product question, including returning nil answers for unanswerable questions. Experiments on a widely used Amazon dataset show encouraging results of our proposed framework. More broadly, our results demonstrate a novel and effective application of conformal methods to a retrieval task.
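
To make the rejection idea concrete, the following is a minimal sketch of an inductive conformal predictor with a reject option. It is not the authors' implementation: the nonconformity score (one minus the model's relevance probability), the function names, the calibration values, and the significance level `epsilon` are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple


def conformal_p_value(cal_scores: Sequence[float], test_score: float) -> float:
    """p-value of a test nonconformity score against calibration scores.

    Counts calibration examples that are at least as nonconforming as the
    test example, with the usual +1 smoothing for the test example itself.
    """
    n_at_least = sum(1 for s in cal_scores if s >= test_score)
    return (n_at_least + 1) / (len(cal_scores) + 1)


def filter_reviews(
    cal_probs: Sequence[float],
    candidates: Sequence[Tuple[str, float]],
    epsilon: float,
) -> List[Tuple[str, float]]:
    """Keep only reviews whose 'relevant' label is not rejected at level epsilon.

    cal_probs: model probabilities for held-out reviews known to be relevant
               (hypothetical calibration data).
    candidates: (review text, model probability of relevance) pairs.
    Returns (review text, p-value) pairs sorted by p-value, possibly empty.
    """
    # Assumed nonconformity measure: low probability = high nonconformity.
    cal_scores = [1.0 - p for p in cal_probs]
    kept = []
    for text, prob in candidates:
        p_val = conformal_p_value(cal_scores, 1.0 - prob)
        if p_val > epsilon:  # otherwise the review is rejected as unreliable
            kept.append((text, p_val))
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept


# Hypothetical calibration probabilities and candidate reviews.
CAL_PROBS = [0.9, 0.8, 0.85, 0.7, 0.95, 0.6, 0.75, 0.88, 0.92]
answerable = [("Battery lasts two days.", 0.87), ("Arrived on time.", 0.2)]
unanswerable = [("Nice colour.", 0.1), ("Arrived on time.", 0.2)]

print(filter_reviews(CAL_PROBS, answerable, epsilon=0.2))
print(filter_reviews(CAL_PROBS, unanswerable, epsilon=0.2))
```

In this sketch, an empty list plays the role of the nil answer: when every candidate review is rejected, the question is treated as unanswerable from the available reviews.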

Notes

1. https://www.amazon.com/
2. https://world.taobao.com/
3. https://www.mturk.com/
4. https://github.com/zswvivi/ecml_pqa
5. https://cseweb.ucsd.edu/~jmcauley
6. https://github.com/zswvivi/icdm_pqa
7. The original implementation uses a softmax activation function to compute P(r|q) (and so the probabilities of all reviews sum to one); we make a minor modification and use a sigmoid function instead (and so each review produces a valid probability distribution over the positive and negative classes).
8. Following the original papers, a “review” is technically a “review sentence” rather than the full review.
9. To control for quality, we insert a control question with a known answer (from the QA pair) in every 3 questions. Workers who consistently give low scores to these control questions are filtered out.
10. This step is only needed for moqa, as bertqa and fltr produce probabilities in the first place. For moqa, we convert the review score into a probability by applying a sigmoid function to the log score.
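
Notes 7 and 10 both concern turning model scores into probabilities. The sketch below uses hypothetical scores and helper names that are not taken from the paper's code. It contrasts a softmax over all reviews (the probabilities sum to one) with a per-review sigmoid (each review gets its own probability of being relevant). It also shows that, if "log score" is read as the logarithm of a positive raw score s, applying a sigmoid to it reduces to s / (1 + s); that reading is an assumption.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Normalise scores across reviews so that they sum to one."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x: float) -> float:
    """Map a single score to a probability, independently of other reviews."""
    return 1.0 / (1.0 + math.exp(-x))


def score_to_probability(score: float) -> float:
    """Sigmoid of the log of a positive score, equal to score / (1 + score)."""
    return sigmoid(math.log(score))


# Hypothetical per-review scores for one question.
logits = [2.0, -1.0, 0.5]
print(sum(softmax(logits)))          # sums to one (up to rounding)
print([sigmoid(x) for x in logits])  # independent values, need not sum to one
print(score_to_probability(3.0))     # 3 / (1 + 3)
```

With a softmax, at least one review always receives a sizeable share of the probability mass even when none is relevant, whereas per-review sigmoid outputs can all be low, which is what a rejection step needs.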

Acknowledgement

Shiwei Zhang is supported by the RMIT University and CSIRO Data61 Scholarships.

Author information

Authors and Affiliations

RMIT University, Melbourne, Australia
Shiwei Zhang, Xiuzhen Zhang & Jeffrey Chan
The University of Melbourne, Melbourne, Australia
Jey Han Lau
CSIRO Data61, Sydney, Australia
Shiwei Zhang & Cecile Paris

Corresponding author

Correspondence to Xiuzhen Zhang.

Editor information

Editors and Affiliations

Albert-Ludwigs-Universität, Freiburg, Germany
Frank Hutter
TU Darmstadt, Darmstadt, Germany
Kristian Kersting
Ghent University, Ghent, Belgium
Jefrey Lijffijt
Saarland University, Saarbrücken, Germany
Isabel Valera

About this paper

Cite this paper

Zhang, S., Zhang, X., Lau, J.H., Chan, J., Paris, C. (2021). Less Is More: Rejecting Unreliable Reviews for Product Question Answering. In: Hutter, F., Kersting, K., Lijffijt, J., Valera, I. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2020. Lecture Notes in Computer Science, vol 12459. Springer, Cham. https://doi.org/10.1007/978-3-030-67664-3_34

DOI: https://doi.org/10.1007/978-3-030-67664-3_34
Published: 25 February 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-67663-6
Online ISBN: 978-3-030-67664-3
eBook Packages: Computer Science, Computer Science (R0)
