Thank you for the measured, thoughtful reviews.  Thank you also for
the helpful pointers to related work.

We agree with your assessments and have identified no obvious
misunderstandings: our paper tries to show how to implement a
sophisticated optimization engine in a principled way.  We address your
concerns below; please don't feel that you must read all the details.

-----------------------------------------------------------------

We're red-faced about the overenthusiastic tone of the paper.  
A revised version will adopt a more suitable tone.

Regarding performance, we have only very rough data.  Using our ideas
in an eager language (Objective Caml) resulted in a slight performance
*improvement*, as documented in our 2005 paper.  The version of GHC
with our code runs roughly 15% slower than the version without, but
this version of GHC includes wholesale changes to the back end, and we
don't know what portion of the slowdown can be attributed to the
optimizer.  In short, we think it unlikely that we are taking a "big hit."

Regarding edges, we must have misstated something: the term 'edges' is
intended to refer to a relation between nodes, not just between
blocks.  We fear we've misled you by our use of the word 'Edges' to
describe a type class; a better name might be 'NonLocal', as the
operations of this class gather the information needed to understand
how control flows between basic blocks.

A function call is indeed represented as an open/closed call node with
a closed/open return continuation.  Multiple calls may share a return
continuation, and one of the optimizations we've implemented in GHC
arranges for call sites to share return continuations when possible.

Regarding use cases, we do support other control-flow algorithms; for
example, we use Hoopl to compute dominators by a purely functional
variant of the algorithm of Cooper, Harvey, and Kennedy (2001).
We have also implemented a small suite of pure control-flow
optimizations that that do not use any dataflow analysis, such as the
elimination of branches to branches.  We have not contemplated trying
to extend Hoopl to establish or maintain SSA invariants.

As noted by Referee D, soundness is not at all obvious, and it is
definitely relative to the soundness of the transfer functions.  
We don't prove soundness; instead, we appeal to the proof provided by
Lerner, Grove, and Chambers (2002).  We'll try to make this appeal
clearer.

In Section 4.8, column 1, after each iteration we throw away the
rewritten graph but we keep the improved FactBase.  When the FactBase
stops improving, we have reached a fixed point, at which stage we keep
*both* the improved FactBase *and* the rewritten graph.  Our use of
the term "virgin" to describe the original graph is gratuitous and
without meaning; we'll remove it.

Your point about GADTs and their utility is well taken: from the
client's point of view, we could do almost as well with phantom types.
The major exception is that without GADTs, a client would have to
provide smart constructors for nodes: the client must ensure that each
time a node is created with a particular value constructor, it always
gets the same phantom type.

The definition of 'rewriteE' appears in a 'where' clause in Figure 5
on page 7.

Expansion-Passing Style does indeed look related to our rewrite
functions that return rewrite functions, but they are not identical.
To elucidate that relationship, we will have to dig deeper.

We'll improve the tone of section 5---but we would welcome your joint
opinion about whether condensing that section would make the paper better.
