Quirksand

SICP

SICP 3.1.2 Solutions

June 11, 2017 10:50

Exercise 3.5.

This exercise has essentially two parts: the integration procedure itself, and then the means of applying it (which will also test that it functions correctly). I’ll discuss them separately.

Since we are using the Monte Carlo approach for integration, we first have to set up the parameters of a trial. We are given a hit/miss test, which is a procedure that returns true or false depending on whether the point is contained under the function or not. We pick random points using random-in-range with the range being x- and y-values of the bounding box. Then we run that through monte-carlo. The result will give us the fraction of points that were ‘hits’ in the defined region. To get the estimated area of our function, we multiply that fractional result by the total bounded area, which is just that of the rectangle defined by the corner points. As we test more points, the ratio of the valid points to the total points tested by the Monte Carlo trials approaches the ratio of the function’s area to the area of the bounding box, which is the desired integration result.

(define (estimate-integral P x1 x2 y1 y2 trials)
  (let ((total-area (* (- x2 x1) (- y2 y1)))
        (hit-fraction (monte-carlo trials (lambda () 
                                            (P (random-in-range x1 x2)
                                               (random-in-range y1 y2)
                                               )
                                            )
                                   )
                      )
        )
    (* hit-fraction total-area)
    )
  )

The application/testing portion is asking us to estimate the value of π by using the unit circle. Since our integrator simply measures the area given certain constraints, and π is part of the formula for the area of a circle, all we need to determine π is to run our integrator over a circle, and divide out the non-π portion of the area formula (i.e. radius squared). The bounding box can be set by adding and subtracting one radius length to the center of the circle, as that gives us a square that tightly encloses our circle.

We then need a proper predicate to test whether a point is within a circle. Given a point (x, y), we can determine whether it’s in a circle by checking to see if the point’s distance from the circle’s center is less than the radius of the circle. I wrote a generic procedure that can accept circles of any radius centered on other locations (though the variables are set external to the test itself). This does mean there is more computation involved for the test, but the procedure runs fast enough for me that the extra time is barely noticeable.

(define circleP (lambda (x y)
                  (< (+ (sqr (- x xc)) (sqr (- y yc))) r_squared) 
                  )
  )

This is a simple application of the distance formula, but to save time in the calculation, the value is compared to the square of the radius instead of taking the square root of the other side of the equation.

In testing, I added a few variations to see what effect it would have on the estimation. One thing that can change is the size of the circle. This doesn’t have a noticeable effect on how many trials it takes to get an accurate estimate. That make sense, since all that happens is both the estimated value and the expected value are scaled by some amount (although if you took this too far, at some point you could lose precision using floating point numbers, so you are in theory better off with a unit circle).

The other variation was to change the size of the bounding box relative to the circle. This results in a typically less accurate estimate for the same number of trials, when compared to the tighter bounding box. It’s not too surprising to see this result either. The Monte Carlo test works better as the proportion of hits approaches the actual proportion of the areas. With a lower chance to hit (because of a smaller ‘target’ area), the ratio of hits is going to be computed with fewer successes, and the expected error will be greater.

Exercise 3.6.

Structurally, the random generator first needs to determine whether it is generating or in reset mode. It can be structured similarly to the bank account procedure of Section 3.1.1, with dispatch to locally defined functions depending on the argument. There is a crucial difference, however — rand is called with the command directly, at least for generating values.

In the back account example, the internal state variable was defined within make-account, which returned a procedure that could be called with the commands; those actions were limited solely to that account (i.e. procedure). We have to come up with our own approach here. The ‘internal state variable’ referred to in the text needs to be defined separately if calls to rand are to work as specified.

One way would be to copy the bank account procedure, and create a pseudo-random number generator generator (and I think it’d be good to shorten the first part to PRNG, so we can call it a PRNG Generator). Then rand would just send calls to itself to the generated PRNG. That is what I did in my first solution, but to demonstrate an alternative way of doing things, I included another approach (I call these ‘Method 1’ and ‘Method 2’ in the solution file). In Method 2, I used an external (global) variable, which rand relies on. While that is simpler, the first way is arguably better, since whatever is keeping track of the random sequence is confined only to the rand function, and can’t be modified by anything else.

The definition of rand then resembles the definition of acc in the previous section — given a procedure make-prng that returns a procedure, rand is named using that procedure. Internally, calls to generate and reset are handled slightly differently, since once returns a value, and the other returns a procedure that accepts the new value; this is taken care of within the dispatch portion.

As for what is actually stored in the internal variable, it is the last random value, which can be sent to a procdure like rand-update that will generate the next value. In my version, there is an internal rand-update, whose parameters are determined at the time make-prng is called. Those paramters are used by a LCG (Linear Congruential Generator), the method mentioned in the text’s footnote. While it’s not a great means for simulating randomness, it is fairly easy to implement and work with. Here’s a brief explanation of the formula, which also includes the requirements to make a ‘good’ sequence by choosing the right parameters.

My second approach, as mentioned, uses an external variable. It also creates the PRNG in a different way. It turns out that Racket/Pretty Big actually has a PRNG Generator already, and it’s called make-pseudo-random-number-generator. Calls to random (the built-in function) can also specify which PRNG to use by adding it as an optional final argument. Other Scheme implementations may have a similar procedure with an alternate name (MIT Scheme, for example, has make-random-state).

To reset the state of our PRNG, we can use the built-in random-seed function and give it the reset value. I initially took this approach. Calls to reset provide a new (integer) value for the seed. The actual implementation in Racket is slightly more complicated, since random-seed can only be called on the default PRNG (the result of calls to random with no PRNG specified). To use random-seed with a different PRNG, one would have to temporarily set it as the default, and then restore the original when done.

I eventually switched away from this reset method. What I did instead was to pass an entire PRNG in a known state as the reset value. My Method 2 makes use of another procedure that copies the state of a PRNG without changing it. We can’t simply use set! on the PRNG variable, since otherwise any later call to the PRNG we passed would affect our random function, and destroy the ‘known state’ it had when created. This does complicate how one chooses a new state in some situations, although the text implies more concern about repeating a sequence than about picking reset states. As long as the PRNG isn’t used elsewhere, it will remain in the state it was.

In this case rand handles the dispatch directly, and it is a little shorter. However, there is the already-mentioned problem that the external variable has; it can be modified outside of the rand procedure, ruining the repeatability of the sequence. Obviously a hybrid approach can make use of both methods — keeping the PRNG contained only in rand, and making use of the language-supported means of generating the PRNG to reduce the work we need to do.

Solutions 3.1.2