Hi Ao,
Regarding question 1: for now, go ahead and assume diminishing sensitivity. If the estimator turns out to work well, we might later try to see whether we can avoid that assumption. But for a first round of work on implementation, it's a completely reasonable thing to assume, I think. If you disagree, though, do let me know.
Regarding question 2: I am aware of Vince's work on this; I saw him present it way back in 2013, and we tried to organize a conference section together. I think only an abstract exists, and to be honest, I'm not sure how their classification approach works. Not enough details were made public, last I checked. But maybe you've seen more than I have at this point, if you saw the presentation. It'd be great if you could forward the abstract and/or paper if you have it.
In the long run, it would be great to explore heterogeneity in reference points. The test we are building is actually pretty well suited to studying heterogeneity by observables. For unobservable heterogeneity, it's a bit trickier. Estimating reference points is hard enough that I think even the homogeneous test is a significant step forward, and I think it would be useful and publishable without that component. That said, once we have that test working, we will want to explore heterogeneous extensions, either to improve paper 1 or to form a new second paper.