-
Notifications
You must be signed in to change notification settings - Fork 31
Description
The problem
Consider walking over a tree using a streaming library to do something at every leaf:
data Tree = Branch Tree Tree | Leaf Int
walkTreeTransformer :: (Monad (m IO), MonadTrans m)
=> (Int -> IO ()) -> Tree -> m IO ()
walkTreeTransformer doSomething = loop
where loop = \case
Leaf i -> lift (doSomething i)
Branch t1 t2 -> do
loop t1
loop t2On left-skewed trees, i.e. trees that will end up generating left-associated binds, for example those generated by leftSkewed
-- Fully evaluated left-skewed tree
leftSkewed :: Int -> Tree
leftSkewed 0 = Leaf 0
leftSkewed n = (Branch $! leftSkewed (n - 1)) (Leaf n)the performance of your streaming library is quadratic. In fact performance is quadratic under
(Neither streamly nor conduit have quadratic performance)
The plot below demonstrates the claim. The code to generate the plot is available at https://github.com/tomjaguarpaw/streaming-benchmark. The compiler was GHC 8.10.7.
The solution
Firstly, before talking about a solution, do you actually consider this a bug? I do, and I think the risk of hitting unexpected quadratic performance makes this library unsuitable for production use as it is. On the other hand maybe you have a good reason that I shouldn't think that. If so I would be interested to hear it.
If you do think this is a bug then let's think about what can be done. I know of two straightforward options:
- Add an explicit constructor for binds and then reassociate binds during elimination
- Use
Codensity(streamly essentially uses handwrittenCodensity.)
In my benchmarks explicit bind comes out slightly ahead.
Perhaps there is another option more suitable for your library.
I welcome your thoughts.
