Empirical algorithmics

In computer science, empirical algorithmics (or experimental algorithmics) is the practice of using empirical methods to study the behavior of algorithms. The practice combines algorithm development and experimentation: algorithms are not just designed, but also implemented and tested in a variety of situations. In this process, an initial design of an algorithm is analyzed so that the algorithm may be developed in a stepwise manner.[1]

Overview

Methods from empirical algorithmics complement theoretical methods for the analysis of algorithms.[2] Through the principled application of empirical methods, particularly from statistics, it is often possible to obtain insights into the behavior of algorithms, such as high-performance heuristic algorithms for hard combinatorial problems, that are (currently) inaccessible to theoretical analysis.[3] Empirical methods can also be used to achieve substantial improvements in algorithmic efficiency.[4]
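As a minimal sketch of the statistical approach described above, the following Python example estimates the growth order of an algorithm from measurements alone. It counts key comparisons made by an insertion sort on reversed inputs of increasing size, then fits a least-squares line to the log-log data; the slope approximates the exponent of the growth rate. The function names, the input sizes, and the choice of insertion sort are illustrative assumptions and are not taken from the cited sources. Comparisons are counted instead of wall-clock time so that the measurement is repeatable.

```python
import math


def insertion_sort_comparisons(values):
    """Sort a copy of values and return the number of key comparisons made."""
    a = list(values)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if a[j] <= key:
                break
            a[j + 1] = a[j]  # shift the larger element one slot right
            j -= 1
        a[j + 1] = key
    return comparisons


def fitted_exponent(sizes, costs):
    """Least-squares slope of log(cost) against log(size)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(c) for c in costs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


# Hypothetical experiment: reversed inputs are the worst case for insertion sort.
sizes = [64, 128, 256, 512]
costs = [insertion_sort_comparisons(range(n, 0, -1)) for n in sizes]
exponent = fitted_exponent(sizes, costs)
print(f"estimated exponent: {exponent:.2f}")
```

A fitted exponent close to 2 is consistent with quadratic worst-case behavior. For algorithms whose behavior resists theoretical analysis, the same kind of fit can suggest a working hypothesis about growth, though it does not prove a bound.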

American computer scientist Catherine McGeoch identifies two main branches of empirical algorithmics: the first (known as empirical analysis) deals with the analysis and characterization of the behavior of algorithms, and the second (known as algorithm design or algorithm engineering) is focused on empirical methods for improving the performance of algorithms.[5] The former often relies on techniques and tools from statistics, while the latter is based on approaches from statistics, machine learning and optimization. Dynamic analysis tools, typically performance profilers, are commonly used when applying empirical methods to select and refine algorithms for use in particular contexts.[6][7][8]

Research in empirical algorithmics is published in several journals, including the ACM Journal on Experimental Algorithmics (JEA) and the Journal of Artificial Intelligence Research (JAIR). Besides Catherine McGeoch, well-known researchers in empirical algorithmics include Bernard Moret, Giuseppe F. Italiano, Holger H. Hoos, David S. Johnson, and Roberto Battiti.[9]

Performance profiling in the design of complex algorithms

In the absence of empirical algorithmics, analyzing the complexity of an algorithm can involve various theoretical methods, each applicable to different situations in which the algorithm may be used.[10] Memory and cache behavior are often significant factors in the theoretical choice of a complex algorithm, or of an approach to optimizing it, for a given purpose.[11][12] Performance profiling is a dynamic program analysis technique typically used for finding and analyzing bottlenecks in an entire application's code[13][14][15] or for analyzing an entire application to identify poorly performing code.[16] A profiler can reveal the code most relevant to an application's performance issues.[17]
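The following sketch illustrates how a profiler can point to the code most relevant to a performance issue, using the `cProfile` and `pstats` modules from the Python standard library. The two functions, `count_duplicates_slow` and `contains`, are hypothetical and were written for this illustration; the bottleneck (a linear scan inside a loop) is deliberate.

```python
import cProfile
import io
import pstats


def contains(seen, item):
    """Linear scan over a list: the deliberate bottleneck in this sketch."""
    return item in seen


def count_duplicates_slow(items):
    """Count items that repeat an earlier item, using a list as the 'seen' store."""
    seen = []
    duplicates = 0
    for item in items:
        if contains(seen, item):
            duplicates += 1
        else:
            seen.append(item)
    return duplicates


# Profile a single call on a hypothetical workload of 5000 items, 500 distinct.
profiler = cProfile.Profile()
profiler.enable()
result = count_duplicates_slow([i % 500 for i in range(5000)])
profiler.disable()

# Write the five most expensive entries (by cumulative time) to a text buffer.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

In the printed report, the row for `contains` shows how many times it was called and how much time was spent in it, which would suggest replacing the list with a set. The exact timings vary by machine and Python version.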

A profiler may help to determine when to choose one algorithm over another in a particular situation.[18] When an individual algorithm is profiled, as with complexity analysis, memory and cache considerations are often more significant than instruction counts or clock cycles; the profiler's findings can therefore be considered in light of how the algorithm accesses data rather than the number of instructions it uses.[19]
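One way to gather evidence for choosing between two algorithms in a particular situation is to time both on inputs representative of that situation. The sketch below uses the standard `timeit` module to compare a linear scan with a binary search (via `bisect`) on sorted lists of several sizes. The candidate functions, the sizes, and the repeat counts are assumptions chosen for illustration, and no particular outcome is claimed: which candidate is faster at a given size depends on the machine and the interpreter.

```python
import bisect
import timeit


def linear_search(sorted_values, target):
    """Return the index of target, or -1, by scanning from the front."""
    for index, value in enumerate(sorted_values):
        if value == target:
            return index
    return -1


def binary_search(sorted_values, target):
    """Return the index of target, or -1, using bisection on a sorted list."""
    index = bisect.bisect_left(sorted_values, target)
    if index < len(sorted_values) and sorted_values[index] == target:
        return index
    return -1


def best_time(func, data, target, repeats=5, number=200):
    """Best per-call time in seconds over several repeated timing runs."""
    timer = timeit.Timer(lambda: func(data, target))
    return min(timer.repeat(repeat=repeats, number=number)) / number


def compare(sizes):
    """Time both candidates on each size; the target is the last element."""
    rows = []
    for n in sizes:
        data = list(range(n))
        target = n - 1  # worst case for the linear scan
        rows.append((n,
                     best_time(linear_search, data, target),
                     best_time(binary_search, data, target)))
    return rows


measurements = compare([4, 64, 1024])
for n, linear_s, binary_s in measurements:
    print(f"n={n:5d}  linear={linear_s:.2e}s  binary={binary_s:.2e}s")
```

Taking the minimum over several repeats reduces the influence of background noise on the measurement. Timings of this kind capture the combined effect of instruction counts and data access, so they complement rather than replace an analysis of how each algorithm touches memory.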

Profiling may provide intuitive insight into an algorithm's behavior[20] by revealing performance findings as a visual representation.[21] Performance profiling has been applied, for example, during the development of algorithms for matching wildcards. Early algorithms for matching wildcards, such as Rich Salz's wildmat algorithm,[22] typically relied on recursion, a technique criticized on grounds of performance.[23] The Krauss matching wildcards algorithm was developed based on an attempt to formulate a non-recursive alternative using test cases[24] followed by optimizations suggested via performance profiling,[25] resulting in a new algorithmic strategy conceived in light of the profiling along with other considerations.[26] Profilers that collect data at the level of basic blocks[27] or that rely on hardware assistance[28] provide results that can be accurate enough to assist software developers in optimizing algorithms for a particular computer or situation.[29] Performance profiling can help developers understand the characteristics of complex algorithms applied in complex situations, such as coevolutionary algorithms applied to arbitrary test-based problems, and may help lead to design improvements.[30]
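To make the contrast between recursive and non-recursive wildcard matching concrete, the sketch below gives one generic matcher of each kind for patterns in which `*` matches any run of characters and `?` matches exactly one. These are textbook-style illustrations written for this example; they are not Salz's wildmat code or the Krauss algorithm, and the function names and test cases are assumptions. A shared set of test cases checks that the two versions agree, which is the kind of safeguard needed before profiling and optimizing either one.

```python
def match_recursive(pattern, text):
    """Recursive matcher: each '*' branches into 'match nothing' or 'consume one'."""
    if not pattern:
        return not text
    if pattern[0] == "*":
        return (match_recursive(pattern[1:], text)
                or (bool(text) and match_recursive(pattern, text[1:])))
    if text and (pattern[0] == "?" or pattern[0] == text[0]):
        return match_recursive(pattern[1:], text[1:])
    return False


def match_iterative(pattern, text):
    """Non-recursive matcher: remember the last '*' and backtrack to it on mismatch."""
    p = t = 0
    star_p = -1   # position of the most recent '*' in the pattern, if any
    star_t = 0    # text position that '*' is currently assumed to extend to
    while t < len(text):
        if p < len(pattern) and pattern[p] == "*":
            star_p, star_t = p, t
            p += 1
        elif p < len(pattern) and (pattern[p] == "?" or pattern[p] == text[t]):
            p += 1
            t += 1
        elif star_p != -1:
            star_t += 1          # let the last '*' absorb one more character
            t = star_t
            p = star_p + 1
        else:
            return False
    while p < len(pattern) and pattern[p] == "*":
        p += 1                   # trailing '*' can match the empty string
    return p == len(pattern)


# Hypothetical test cases: (pattern, text, expected result).
cases = [
    ("a*b?d", "axxbcd", True),
    ("*", "", True),
    ("a?c", "ac", False),
    ("*a*b", "aaab", True),
    ("abc", "abd", False),
    ("a*", "b", False),
]
for pattern, text, expected in cases:
    assert match_recursive(pattern, text) == expected
    assert match_iterative(pattern, text) == expected
print("both matchers agree on all cases")
```

The recursive version can revisit the same subproblems many times when a pattern contains several `*` characters, whereas the iterative version keeps only a single backtracking point. Once both pass the same tests, a profiler can be run on representative inputs to see where each spends its time.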

References

  1. ^ Fleischer, Rudolf; et al., eds. (2002). Experimental Algorithmics, From Algorithm Design to Robust and Efficient Software. Springer International Publishing AG.
  2. ^ Moret, Bernard M. E. (1999). Towards A Discipline Of Experimental Algorithmics. DIMACS Series in Discrete Mathematics and Theoretical Computer Science. Vol. 59. pp. 197–213. doi:10.1090/dimacs/059/10. ISBN 9780821828922. S2CID 2221596.
  3. ^ Hromkovic, Juraj (2004). Algorithmics for Hard Problems. Springer International Publishing AG.
  4. ^ Guzman, John Paul; Limoanco, Teresita (2017). "An Empirical Approach to Algorithm Analysis Resulting in Approximations to Big Theta Time Complexity" (PDF). Journal of Software. 12 (12).
  5. ^ McGeoch, Catherine (2012). A Guide to Experimental Algorithmics. Cambridge University Press. ISBN 978-1-107-00173-2.
  6. ^ Coppa, Emilio; Demetrescu, Camil; Finocchi, Irene (2014). "Input-Sensitive Profiling". IEEE Transactions on Software Engineering. 40 (12): 1185–1205. CiteSeerX 10.1.1.707.4347. doi:10.1109/TSE.2014.2339825.
  7. ^ Moret, Bernard M. E.; Bader, David A.; Warnow, Tandy (2002). "High-Performance Algorithm Engineering for Computational Phylogenetics" (PDF). The Journal of Supercomputing. 22 (1): 99–111. doi:10.1023/a:1014362705613. S2CID 614550.
  8. ^ Zaparanuks, Dmitrijs; Hauswirth, Matthias (2012). Algorithmic Profiling. 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM Digital Library. pp. 67–76. CiteSeerX 10.1.1.459.4913.
  9. ^ "On experimental algorithmics: an interview with Catherine McGeoch and Bernard Moret". Ubiquity. 2011 (August). ACM Digital Library. 2011.
  10. ^ Grzegorz, Mirek (2018). "Big-O Ambiguity". performant code_.
  11. ^ Kölker, Jonas (2009). "When does Big-O notation fail?". Stack Overflow.
  12. ^ Lemire, Daniel (2013). "Big-O notation and real-world performance". WordPress.
  13. ^ "Finding Application Bottlenecks". dotTrace 2018.1 Help. JetBrains. 2018.
  14. ^ Shmeltzer, Shay (2005). "Locating Bottlenecks in Your Code with the Event Profiler". Oracle Technology Network JDeveloper documentation. Oracle Corp.
  15. ^ Shen, Du; Poshyvanyk, Denys; Luo, Qi; Grechanik, Mark (2015). "Automating performance bottleneck detection using search-based application profiling" (PDF). Proceedings of the 2015 International Symposium on Software Testing and Analysis. ACM Digital Library. pp. 270–281. doi:10.1145/2771783.2771816. ISBN 9781450336208. S2CID 8625903.
  16. ^ "Performance & Memory Profiling and Code Coverage". The Profile Learning Center. SmartBear Software. 2018.
  17. ^ Janssen, Thorben (2017). "11 Simple Java Performance Tuning Tips". Stackify Developer Tips, Tricks and Resources.
  18. ^ O'Sullivan, Bryan; Stewart, Don; Goerzen, John (2008). "25. Profiling and optimization". Real World Haskell. O'Reilly Media.
  19. ^ Linden, Doug (2007). "Profiling and Optimization". Second Life Wiki.
  20. ^ Pattis, Richard E. (2007). "Analysis of Algorithms, Advanced Programming/Practicum, 15-200". School of Computer Science, Carnegie Mellon University.
  21. ^ Wickham, Hadley (2014). "Optimising code". Advanced R. Chapman and Hall/CRC.
  22. ^ Salz, Rich (1991). "wildmat.c". GitHub.
  23. ^ Cantatore, Alessandro (2003). "Wildcard matching algorithms".
  24. ^ Krauss, Kirk (2008). "Matching Wildcards: An Algorithm". Dr. Dobb's Journal.
  25. ^ Krauss, Kirk (2014). "Matching Wildcards: An Empirical Way to Tame an Algorithm". Dr. Dobb's Journal.
  26. ^ Krauss, Kirk (2018). "Matching Wildcards: An Improved Algorithm for Big Data". Develop for Performance.
  27. ^ Grehan, Rick (2005). "Code Profilers: Choosing a Tool for Analyzing Performance" (PDF). Freescale Semiconductor. "If, on the other hand, you need to step through your code with microscopic accuracy, fine-tuning individual machine instructions, then an active profiler with cycle-counting cannot be beat."
  28. ^ Hough, Richard; et al. (2006). "Cycle-Accurate Microarchitecture Performance Evaluation". Proceedings of Workshop on Introspective Architecture. Georgia Institute of Technology. CiteSeerX 10.1.1.395.9306.
  29. ^ Khamparia, Aditya; Banu, Saira (2013). Program Analysis with Dynamic Instrumentation Pin and Performance Tools. IEEE International conference on Emerging trends in Computing, Communication and Nanotechnology. IEEE Xplore Digital Library.
  30. ^ Jaskowski, Wojciech; Liskowski, Pawel; Szubert, Marcin Grzegorz; Krawiec, Krzysztof (2016). "The performance profile: A multi-criteria performance evaluation method for test-based problems" (PDF). Applied Mathematics and Computer Science. 26. De Gruyter: 216.