ghc パターンマッチの時間計算量

この記事は, 旧ブログから移植された記事です. よって, その内容として, 旧ブログに依存した文脈が含まれている可能性があります. 予めご了承下さい.

reddit で見かけて, ふと気になったのでメモ. GCC で C/C++ コードの switch 文および case 節をコンパイルするとき, case 節の数が一定以上を超えると, ジャンプテーブルを利用したアセンブリが吐き出される1. 同様にして, ghc はパターンマッチでジャンプテーブルが用いられる場合がある. 以下, メーリングリスト2から, パターンマッチの時間計算量に関する言及について一部引用.

(snip) To answer you question, O(1) for simple patterns, but it really depends on the complexity of the pattern-matching expression and the Core-to-Core transformations that GHC applies. To truly understand the complexity, you need take a look at the Core/STG dump (I prefer STG since it’s simple). If you have any particular code samples you’d like me to help you analyze, I’d be happy to do so.
A basic example:
data Color = Red | Blue | Green
isRed Red = True
isRed _ = False
GHC will transform this to
isRed x = case x of { Red -True; DEFAULT -False }
You can think of a case as a switch expression in your standard imperative languages. A case expression will evaluate the thunk ‘x’ and perform a switch on the tag of the result. Each data constructor has an integer tag associated with it which will be the target of the switch. So the time complexity of isRed will be the time complexity of thunk evaluation which is impossible to predict because a thunk can be incredibly complex. Lazy evaluation is not so easy to analyze. It is highly context-sensitive.(snip)
The way you’re measuring algorithmic complexity does carry over to the lazy setting provided it’s single-threaded because the order of execution is purely determined by the STG Code. In the concurrent lazy setting, it’s a bit trickier since lightweight locking mechanisms occur when multiple threads evaluate the same thunk, making it non-deterministic.

参考文献


  1. godbolt, gcc-mirror, gcc-mirror,↩︎

  2. time complexity of pattern matching - mail.Haskell, time complexity of pattern matching - mail.Haskell↩︎