Skip to content

Quadratic performance on left-associated binds #109

@tomjaguarpaw

Description

@tomjaguarpaw

The problem

Consider walking over a tree using a streaming library to do something at every leaf:

data Tree = Branch Tree Tree | Leaf Int

walkTreeTransformer :: (Monad (m IO), MonadTrans m)
                    => (Int -> IO ()) -> Tree -> m IO ()
walkTreeTransformer doSomething = loop
  where loop = \case
          Leaf i -> lift (doSomething i)
          Branch t1 t2 -> do
            loop t1
            loop t2

On left-skewed trees, i.e. trees that will end up generating left-associated binds, for example those generated by leftSkewed

-- Fully evaluated left-skewed tree
leftSkewed :: Int -> Tree
leftSkewed 0 = Leaf 0
leftSkewed n = (Branch $! leftSkewed (n - 1)) (Leaf n)

the performance of your streaming library is quadratic. In fact performance is quadratic under

(Neither streamly nor conduit have quadratic performance)

The plot below demonstrates the claim. The code to generate the plot is available at https://github.com/tomjaguarpaw/streaming-benchmark. The compiler was GHC 8.10.7.

image

The solution

Firstly, before talking about a solution, do you actually consider this a bug? I do, and I think the risk of hitting unexpected quadratic performance makes this library unsuitable for production use as it is. On the other hand maybe you have a good reason that I shouldn't think that. If so I would be interested to hear it.

If you do think this is a bug then let's think about what can be done. I know of two straightforward options:

  1. Add an explicit constructor for binds and then reassociate binds during elimination
  2. Use Codensity (streamly essentially uses handwritten Codensity.)

In my benchmarks explicit bind comes out slightly ahead.

Perhaps there is another option more suitable for your library.

I welcome your thoughts.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions