Revisiting the Algorithm RBSC-SubGen:

Exploring its Limits and Improving its Convergence

Keywords: Simple difference formula, subset selection problem, rank-biserial correlation

Abstract

This study focuses on the algorithm RBSC-SubGen, which is originally offered for deck generation. We first study the resilience of RBSC-SubGen against various hyper-parameters by testing it under various constraints. Our results indicate that RBSC-SubGen is sensitive to subset size S, followed by desired RBSC coefficient ρ∗, permissible disparity ε and, finally, parent set size L. We then offer a simple modification for reducing the computational load of RBSC-SubGen and avoiding saturation. In particular, we offer to carry an intermediate level verification of the desired conditions. Additional tests show that the rate of saturation and number of iterations decrease, whereas the disparity in the RBSC coefficient increases within acceptable limits.

References

[1] L. J. Hong,W. Fan, and J. Luo, “Review on ranking and selection: A new perspective,” arXiv preprint arXiv:2008.00249, 2020.
[2] T. Foss, E. Stensrud, B. Kitchenham, and I. Myrtveit, “A simulation study of the model evaluation criterion MMRE,” IEEE Tran. Software Engineering, vol. 29, no. 11, pp. 985–995, 2003.
[3] D. J. Eckman, M. Plumlee, and B. L. Nelson, “Revisiting subset selection,” in Proc. Winter Simulation Conf., pp. 2972–2983, IEEE, 2020.
[4] Z. Y¨ucel, P. Supitayakul, A. Monden, and P. Leelaprute, “An algorithm for automatic collation of vocabulary decks based on word frequency,” IEICE Tran. Information and Systems, vol. 103, no. 8, pp. 1865–1874, 2020.
[5] S. S. Gupta, “On some multiple decision (selection and ranking) rules,” Technometrics, vol. 7, no. 2, pp. 225–245, 1965.
[6] L. Huang, J. Wei, and E. Celis, “Towards just, fair and interpretable methods for judicial subset selection,” in Proc. AAAI/ACM Conf. AI, Ethics, and Society, pp. 293–299, 2020.
[7] G. C. McDonald, “Applications of subset selection procedures and Bayesian ranking methods in analysis of traffic fatality data,” Computational Statistics, vol. 8, no. 6, pp. 222–237, 2016.
[8] C.-H. Yeh, B. A. Barsky, and M. Ouhyoung, “Personalized photograph ranking and selection system considering positive and negative user feedback,” ACM Tran. Multimedia Computing, Communications, and Applications, vol. 10, no. 4, pp. 1–20, 2014.
[9] C.-H. Chen, S. E. Chick, L. H. Lee, and N. A. Pujowidianto, “Ranking and selection: Efficient simulation budget allocation,” Handbook of Simulation Optimization, pp. 45–80, 2015.
[10] S. H. Choi and T. G. Kim, “A heuristic approach for selecting best-subset including ranking within the subset,” IEEE Tran. Systems, Man, and Cybernetics-A, vol. 50, no. 10, pp. 3852–3862, 2018.
[11] M. H. Alrefaei and M. Almomani, “Subset selection of best simulated systems,” Journal of the Franklin Institute, vol. 344, no. 5, pp. 495–506, 2007.
[12] Y. Wang, L. Luangkesorn, and L. J. Shuman, “Best-subset selection procedure,” in Proc. Winter Simulation Conf., pp. 4310–4318, IEEE, 2011.
[13] S. Gao, H. Xiao, E. Zhou, and W. Chen, “Robust ranking and selection with optimal computing budget allocation,” Automatica, vol. 81, pp. 30–36, 2017.
[14] M. Mitchell, D. Baker, N. Moorosi, E. Denton, B. Hutchinson, A. Hanna, T. Gebru, and J. Morgenstern, “Diversity and inclusion metrics in subset selection,” in Proc. AAAI/ACM Conf. AI, Ethics, and Society, pp. 117–123, 2020.
[15] S. Gao and W. Chen, “A note on the subset selection for simulation optimization,” in Proc. Winter Simulation Conf., pp. 3768–3776, IEEE, 2015.
[16] C.-H. Chen, D. He, M. Fu, and L. H. Lee, “Efficient simulation budget allocation for selecting an optimal subset,” INFORMS Journal on Computing, vol. 20, no. 4, pp. 579–595, 2008.
[17] M. J. Groves, Efficient Pairwise Information Collection for Subset Selection. PhD thesis, University of Warwick, 2020.
[18] L. J. Hong, J. Luo, and Y. Zhong, “Speeding up pairwise comparisons for large scale ranking and selection,” in Proc. Winter Simulation Conf., pp. 749–757, IEEE, 2016.
[19] D. S. Kerby, “The simple difference formula: An approach to teaching nonparametric correlation,” Comprehensive Psychology, vol. 3, pp. 11–IT, 2014.
[20] Z. Y¨ucel, “Solving the subset generation problem with RBSC-SubGen.” https://github.com/yucelzeynep/Subset-generation-with-RBSC-SubGen, 2021. [Accessed 2022-03-18].
Published
2023-10-07
Section
Technical Papers (Advanced Applied Informatics)