Skip to main content
Log in

Computational performance evaluation of two integer linear programming models for the minimum common string partition problem

  • Original Paper
  • Published:
Optimization Letters Aims and scope Submit manuscript

Abstract

In the minimum common string partition (MCSP) problem two related input strings are given. “Related” refers to the property that both strings consist of the same set of letters appearing the same number of times in each of the two strings. The MCSP seeks a minimum cardinality partitioning of one string into non-overlapping substrings that is also a valid partitioning for the second string. This problem has applications in bioinformatics e.g. in analyzing related DNA or protein sequences. For strings with lengths less than about 1000 letters, a previously published integer linear programming (ILP) formulation yields, when solved with a state-of-the-art solver such as CPLEX, satisfactory results. In this work, we propose a new, alternative ILP model that is compared to the former one. While a polyhedral study shows the linear programming relaxations of the two models to be equally strong, a comprehensive experimental comparison using real-world as well as artificially created benchmark instances indicates substantial computational advantages of the new formulation.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3

Similar content being viewed by others

Notes

  1. http://www-01.ibm.com/software/commerce/optimization/cplex-optimizer.

  2. Note that we confirmed, in this context, that in all cases the values of the LP relaxations concerning \(\textsc {Ilp}_{\mathrm {cb}}\) and \(\textsc {Ilp}_{\mathrm {cs}}\) were equal.

References

  1. Blum, C., Lozano, J.A., Pinacho Davidson, P.: Iterative probabilistic tree search for the minimum common string partition problem. In: Blesa, M.J., Blum, C., Voss, S. (eds.) Proceedings of HM 20104—9th International Workshop on Hybrid Metaheuristics. Lecture Notes in Computer Science, vol. 8457, pp. 154–154. Springer, Berlin (2014)

  2. Blum, C., Lozano, J.A., Pinacho Davidson, P.: Mathematical programming strategies for solving the minimum common string partition problem. Eur. J. Oper. Res. 242(3), 769–777 (2015)

    Article  Google Scholar 

  3. Chen, X., Zheng, J., Fu, Z., Nan, P., Zhong, Y., Lonardi, S., Jiang, T.: Assignment of orthologous genes via genome rearrangement. IEEE/ACM Trans. Comput. Biol. Bioinform. 2(4), 302–315 (2005)

    Article  Google Scholar 

  4. Chrobak, M., Kolman, P., Sgall, J.: The greedy algorithm for the minimum common string partition problem. In: Jansen, K., Khanna, S., Rolim, J.D.P., Ron, D. (eds.) Proceedings of APPROX 2004—7th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems. Lecture Notes in Computer Science, vol. 3122, pp. 84–95. Springer, Berlin (2004)

  5. Cormode, G., Muthukrishnan, S.: The string edit distance matching problem with moves. ACM Trans. Algorithms 3(2), 1–19 (2007)

    Article  MathSciNet  Google Scholar 

  6. Damaschke, P.: Minimum common string partition parameterized. In: Crandall, K.A., Lagergren, J. (eds.) Proceedings of WABI 2008—8th International Workshop on Algorithms in Bioinformatics. Lecture Notes in Computer Science, vol. 5251, pp. 87–98. Springer, Berlin (2008)

  7. Ferdous, S.M., Sohel Rahman, M.: Solving the minimum common string partition problem with the help of ants. In: Tan, Y., Shi, Y., Mo, H. (eds.) Proceedings of ICSI 2013—4th International Conference on Advances in Swarm Intelligence. Lecture Notes in Computer Science, vol. 7928, pp. 306–313. Springer, Berlin (2013)

  8. Ferdous, S.M., Sohel Rahman, M.: A MAX–MIN ant colony system for minimum common string partition problem (2014). arXiv:1401.4539

  9. Fu, B., Jiang, H., Yang, B., Zhu, B.: Exponential and polynomial time algorithms for the minimum common string partition problem. In: Wang, W., Zhu, X., Du, D.Z. (eds.) Proceedings of COCOA 2011—5th International Conference on Combinatorial Optimization and Applications. Lecture Notes in Computer Science, vol. 6831, pp. 299–310. Springer, Berlin (2011)

    Chapter  Google Scholar 

  10. Gallardo, J.E.: A multilevel probabilistic beam search algorithm for the shortest common supersequence problem. PLoS One 7(12) (2012)

  11. Garey, M.R., Johnson, D.S.: Computers and intractability; a guide to the theory of NP-completeness. W. H. Freeman, San Francisco (1979)

  12. Goldstein, A., Kolman, P., Zheng, J.: Minimum common string partition problem: Hardness and approximations. In: Fleischer, R., Trippen, G. (eds.) Proceedings of ISAAC 2004—15th International Symposium on Algorithms and Computation. Lecture Notes in Computer Science, vol. 3341, pp. 484–495. Springer, Berlin (2005)

    Google Scholar 

  13. Goldstein, I., Lewenstein, M.: Quick greedy computation for minimum common string partitions. In: Giancarlo, R., Manzini, G. (eds.) Proceedings of CPM 2011—22nd Annual Symposium on Combinatorial Pattern Matching. Lecture Notes in Computer Science, vol. 6661, pp. 273–284. Springer, Berlin (2011)

    Chapter  Google Scholar 

  14. He, D.: A novel greedy algorithm for the minimum common string partition problem. In: Mandoiu, I., Zelikovsky, A. (eds.) Proceedings of ISBRA 2007—Third International Symposium on Bioinformatics Research and Applications. Lecture Notes in Computer Science, vol. 4463, pp. 441–452. Springer, Berlin (2007)

    Chapter  Google Scholar 

  15. Hsu, W.J., Du, M.W.: Computing a longest common subsequence for a set of strings. BIT Numer. Math. 24(1), 45–59 (1984). doi:10.1007/BF01934514

  16. Jiang, H., Zhu, B., Zhu, D., Zhu, H.: Minimum common string partition revisited. J. Comb. Optim. 23(4), 519–527 (2012)

    Article  MathSciNet  MATH  Google Scholar 

  17. Kaplan, H., Shafrir, N.: The greedy algorithm for edit distance with moves. Inf. Process. Lett. 97(1), 23–27 (2006)

    Article  MathSciNet  MATH  Google Scholar 

  18. Kolman, P.: Approximating reversal distance for strings with bounded number of duplicates. In: Jedrzejowicz, J., Szepietowski, A. (eds.) Proceedings of MFCS 2005—30th International Symposium on Mathematical Foundations of Computer Science. Lecture Notes in Computer Science, vol. 3618, pp. 580–590. Springer, Berlin (2005)

    Chapter  Google Scholar 

  19. Kolman, P., Waleń, T.: Reversal distance for strings with duplicates: linear time approximation using hitting set. In: Erlebach, T., Kaklamanis, C. (eds.) Proceedings of WAOA 2007—4th International Workshop on Approximation and Online Algorithms. Lecture Notes in Computer Science, vol. 4368, pp. 279–289. Springer, Berlin (2007)

    Chapter  Google Scholar 

  20. Meneses, C., Oliveira, C., Pardalos, P.: Optimization techniques for string selection and comparison problems in genomics. IEEE Eng. Med. Biol. Mag. 24(3), 81–87 (2005)

    Article  Google Scholar 

  21. Mousavi, S., Babaie, M., Montazerian, M.: An improved heuristic for the far from most strings problem. J. Heuristics 18, 239–262 (2012)

    Article  Google Scholar 

  22. Shapira, D., Storer, J.A.: Edit distance with move operations. In: Apostolico, A., Takeda, M. (eds.) Proceedings of CPM 2002—13th Annual Symposium on Combinatorial Pattern Matching. Lecture Notes in Computer Science, vol. 2373, pp. 85–98. Springer, Berlin (2002)

    Chapter  Google Scholar 

  23. Smith, T., Waterman, M.: Identification of common molecular subsequences. J. Mol. Biol. 147(1), 195–197 (1981)

    Article  Google Scholar 

Download references

Acknowledgments

C. Blum acknowledges support by grant TIN2012-37930-02 of the Spanish Government. In addition, support is acknowledged from IKERBASQUE (Basque Foundation for Science). Our experiments have been executed in the High Performance Computing environment managed by RDlab (http://rdlab.lsi.upc.edu) and we would like to thank them for their support.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Christian Blum.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Blum, C., Raidl, G.R. Computational performance evaluation of two integer linear programming models for the minimum common string partition problem. Optim Lett 10, 189–205 (2016). https://doi.org/10.1007/s11590-015-0921-4

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11590-015-0921-4

Keywords

Navigation