Analyzing Searching Behavior in Online Shopping Sites based on Product-Specificity of Query Words

Keywords: Searching Behavior, Online Shopping, Information Content, Log Analysis


As the internet continues to spread as a crucial element of social infrastructure, more and more people are shopping online. Online sites that formerly dealt with specific products such as books and clothing have expanded into mall-type shopping sites by incorporating various kinds of stores. This transition has made product searches by users more complicated and prolonged. In this paper, we propose a method that analyzes the transition patterns of the product-specificity of queries in product-searching behavior. As a key concept in our proposed method, we adopt the notion of information content, which represents the amount of information contained in a query, to quantitatively define product-specificity. We conducted an experiment on an actual shopping log dataset to confirm the effectiveness of our proposed method. The results show that the proposed method extracts illuminating behavioral patterns such as “narrowing-down behavior,” in which users keep adding query words, and “expanding behavior,” in which users keep removing query words to increase the number of search results.
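To make the notion of information content concrete, the following Python sketch scores each query by the summed self-information of its words and labels consecutive queries in a session by the direction of change. It is only an illustration under stated assumptions: the abstract does not give the paper's exact formula, so the estimate -log2 p(w) from word frequencies in the log, the summation over query words, and the helper names word_information_content, query_specificity, and label_transitions are all hypothetical choices made here, and the toy log is invented for the example.

```python
import math
from collections import Counter


def word_information_content(queries):
    """Estimate -log2 p(w) for every word, with p(w) taken as the word's
    relative frequency over all query words in the log (an assumption)."""
    counts = Counter(word for query in queries for word in query.split())
    total = sum(counts.values())
    return {word: -math.log2(count / total) for word, count in counts.items()}


def query_specificity(query, ic):
    """Score a query as the sum of its words' information content.
    Words missing from the log contribute 0.0 (a simplification)."""
    return sum(ic.get(word, 0.0) for word in query.split())


def label_transitions(session, ic):
    """Label each consecutive query pair by how the specificity score moves."""
    scores = [query_specificity(query, ic) for query in session]
    labels = []
    for previous, current in zip(scores, scores[1:]):
        if current > previous:
            labels.append("narrowing")
        elif current < previous:
            labels.append("expanding")
        else:
            labels.append("unchanged")
    return labels


# Toy query log and session, invented purely for illustration.
toy_log = ["shoes", "shoes running", "shoes running red", "shoes"]
toy_ic = word_information_content(toy_log)
toy_session = ["shoes", "shoes running", "shoes running red", "shoes running"]
toy_labels = label_transitions(toy_session, toy_ic)
```

In this toy log, rarer words carry more information, so adding "running" and then "red" raises the score and removing "red" lowers it. A real analysis would need the paper's own definition of product-specificity and a much larger log to estimate word probabilities reliably.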

