[SOLVED] How to use of laziness in Scheme efficiently?

Issue

I am trying to encode a small lambda calculus with algebraic datatypes in Scheme. I want it to use lazy evaluation, for which I tried to use the primitives delay and force. However, this has a large negative impact on the performance of evaluation: the execution time on a small test case goes up by a factor of 20x.

While I did not expect laziness to speed up this particular test case, I did not expect a huge slowdown either. My question is thus: What is causing this huge overhead with lazy evaluation, and how can I avoid this problem while still getting lazy evaluation? I would already be happy to get within 2x the execution time of the strict version, but faster is of course always better.

Below are the strict and lazy versions of the test case I used. The test deals with natural numbers in unary notation: it constructs a sequence of 2^24 sucs followed by a zero and then destructs the result again. The lazy version was constructed from the strict version by adding delay and force in appropriate places, and adding let-bindings to avoid forcing an argument more than once. (I also tried a version where zero and suc were strict but other functions were lazy, but this was even slower than the fully lazy version so I omitted it here.)

I compiled both programs using compile-file in Chez Scheme 9.5 and executed the resulting .so files with petite --program. Execution time (user only) for the strict version was 0.578s, while the lazy version takes 11,891s, which is almost exactly 20x slower.

Strict version

(define zero    'zero)
(define (suc x) (cons 'suc x))

(define one   (suc zero))
(define two   (suc one))
(define three (suc two))

(define (twice m)
  (if (eq? m zero)
      zero
      (suc (suc (twice (cdr m))))))

(define (pow2 m)
  (if (eq? m zero)
      one
      (twice (pow2 (cdr m)))))

(define (consume m)
  (if (eq? m zero)
      zero
      (consume (cdr m))))

(consume (pow2 (twice (twice (twice three)))))

Lazy version

(define zero    (delay 'zero))
(define (suc x) (delay (cons 'suc x)))

(define one   (suc zero))
(define two   (suc one))
(define three (suc two))

(define (twice m)
  (delay
    (let ((mv (force m)))
      (if (eq? mv 'zero)
          (force zero)
          (force (suc (suc (twice (cdr mv)))))))))

(define (pow2 m)
  (delay
    (let ((mv (force m)))
      (if (eq? mv 'zero)
          (force one)
          (force (twice (pow2 (cdr mv))))))))

(define (consume m)
  (delay
    (let ((mv (force m)))
      (if (eq? mv 'zero)
          (force zero)
          (force (consume (cdr mv)))))))

(force (consume (pow2 (twice (twice (twice three))))))

Solution

One can see statistics for the two phases of the test program using ChezScheme’s (time ...) procedure:

$ scheme
Chez Scheme Version 9.5.2
> (load-program "strict.ss")
(time (pow2 (twice (...))))
    21 collections
    0.695561822s elapsed cpu time, including 0.521065634s collecting
    0.695607000s elapsed real time, including 0.521191000s collecting
    672586992 bytes allocated, including 236483824 bytes reclaimed
(time (consume u2^24))
    no collections
    0.037766347s elapsed cpu time
    0.037762000s elapsed real time
    0 bytes allocated

and for the lazy version:

$ scheme
> (load-program "lazy.ss")
(time (pow2 (twice (...))))
    no collections
    0.000000000s elapsed cpu time
    0.000000000s elapsed real time
    400 bytes allocated
(time (force (consume u2^24)))
    572 collections
    11.997971385s elapsed cpu time, including 10.798406971s collecting
    12.012723000s elapsed real time, including 10.813517000s collecting
    4832215216 bytes allocated, including 4460306000 bytes reclaimed

So 90% of time is collecting. Adjusting collector parameters may improve this, eg:

(collect-trip-bytes 1000000)  
(collect-generation-radix (greatest-fixnum))  
(heap-reserve-ratio 2.0)

(these values halve lazy time OMM)

One might also replace ChezScheme’s delay and force with stripped down versions:

(import (except (chezscheme) delay force))
        
(define (make-promise p)
  (let ([value (box p)])
    (lambda ()
      (when (box? value)
        (let ([x ((unbox value))])
          (when (box? value)
            (set! value x))))
      value)))
        
(define-syntax delay
  (syntax-rules ()
    [(_ expr) (make-promise (lambda () expr))]))
    
(define (force promise)
  (promise))

(add above to the beginning of lazy.ss)

NB these have no error checking, and don’t handle multiple values or lazy boxes.
(The ChezScheme implementation is here)

With these changes the lazy version is about 4x slower than strict:

$ scheme
> (load-program "lazy.ss")
(time (pow2 (twice (...))))
    no collections
    0.000000000s elapsed cpu time
    0.000000000s elapsed real time
    336 bytes allocated
(time (force (consume u2^24)))
    3813 collections
    2.977003428s elapsed cpu time, including 2.175818398s collecting
    2.977292000s elapsed real time, including 2.179504000s collecting
    4029652320 bytes allocated, including 2414247968 bytes reclaimed

Answered By – mnemenaut

Answer Checked By – Mary Flores (BugsFixing Volunteer)

Leave a Reply

Your email address will not be published. Required fields are marked *