Aaron Staple

2014-10-27 02:33:27 UTC

Greetings sklearn developers,

I'm a new sklearn contributor, and I've been working on a small project to
allow customization of the scoring metric used when scoring out-of-bag data
for random forests (see
https://github.com/scikit-learn/scikit-learn/pull/3723). In this PR,
@mblondel and I have been discussing an architectural issue that we would
like others to weigh in on.

While working on my implementation, I've run into a bit of difficulty using
the scorer implementation as it exists today - in particular, with the
interface expressed in _BaseScorer. The current _BaseScorer interface is
callable, accepting an estimator (used as a Predictor), along with some
prediction data points X, and returning a score. The various _BaseScorer
implementations compute a score by calling estimator.predict(X),
estimator.predict_proba(X), or estimator.decision_function(X) as needed,
possibly applying some transformations to the results, and then applying a
score function.
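
As a rough illustration of the design described above, here is a minimal
sketch in plain Python. PredictScorerSketch, ThresholdClassifier and
accuracy are hypothetical stand-ins written for this example, not
scikit-learn's actual classes, and the sketch assumes the true targets are
passed in alongside X.

```python
# Simplified stand-ins for the scorer design described above; these are
# illustrative classes, not scikit-learn's actual implementation.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


class PredictScorerSketch:
    """Callable that scores an estimator via estimator.predict(X)."""

    def __init__(self, score_func, sign=1):
        self._score_func = score_func
        self._sign = sign  # -1 turns a loss into a "greater is better" score

    def __call__(self, estimator, X, y_true):
        # The prediction step is hidden inside the scorer, so a caller
        # cannot hand in predictions it has already computed.
        y_pred = estimator.predict(X)
        return self._sign * self._score_func(y_true, y_pred)


class ThresholdClassifier:
    """Toy estimator (hypothetical): predicts 1 when x exceeds a threshold."""

    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, X):
        return [1 if x > self.threshold else 0 for x in X]


scorer = PredictScorerSketch(accuracy)
score = scorer(ThresholdClassifier(0.5), [0.1, 0.4, 0.6, 0.9], [0, 1, 1, 1])
```

The point to notice is that the only way to get a score out of the scorer
is to give it something with a predict method.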

The issue I've run into is that predicting out-of-bag samples is a rather
specialized procedure because the model used differs for each training
point, based on how that point was used during fitting. Computing these
predictions is not particularly suited for implementation as a Predictor.
In addition, in the PR we've been discussing the idea that a random forest
estimator will make its out-of-bag predictions available as attributes,
allowing a user of the estimator to subsequently score these provided
predictions. Also, @mblondel mentioned that for his work on multiple-metric
grid search, he is interested in scoring predictions he computes outside of
a Predictor.
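
To make the "model differs for each training point" remark concrete, here
is a small sketch of out-of-bag prediction for a toy bagged ensemble. The
mean-predicting "models", the fixed bootstrap index sets and the function
names are all assumptions chosen to keep the example short; a real random
forest would use trees instead.

```python
# Sketch of out-of-bag (OOB) prediction for a toy bagged ensemble.
# Each "model" here just predicts the mean target of its bootstrap sample;
# that choice is an assumption made to keep the example small.

def fit_mean_model(y, sample_indices):
    """Return the mean target over one bootstrap sample."""
    return sum(y[i] for i in sample_indices) / len(sample_indices)


def oob_predictions(y, bootstrap_samples):
    """Predict each training point using only the models that never saw it."""
    models = [fit_mean_model(y, s) for s in bootstrap_samples]
    predictions = []
    for i in range(len(y)):
        # The set of usable models changes from point to point.
        unseen = [m for m, s in zip(models, bootstrap_samples) if i not in s]
        predictions.append(sum(unseen) / len(unseen) if unseen else None)
    return predictions


y = [1.0, 2.0, 3.0, 4.0]
# Fixed bootstrap index sets (hypothetical) so the result is reproducible.
bootstrap_samples = [[0, 0, 1, 1], [2, 2, 3, 3], [0, 1, 2, 2]]
oob = oob_predictions(y, bootstrap_samples)
```

Because the averaging set depends on the index of the training point, the
result cannot be expressed as a single predict(X) call on one fixed model.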

The difficulty is that the current scorers take an estimator and data
points, and compute predictions internally. They don't accept externally
computed predictions.

I've written up several general options for implementing a system of
scoring externally computed predictions (some are likely undesirable but
are provided as points of comparison):

1) Add a new implementation that's completely separate from the existing
_BaseScorer class.

2) Use the existing _BaseScorer without changes. This means abusing the
Predictor interface and creating something like a dummy predictor that
ignores X and returns the externally computed predictions - predictions
that are not computed from the X passed in, but were computed externally
from a known X value.

3) Add a private API to _BaseScorer for scoring externally computed
predictions. The private API can be called by a public helper function in
scorer.py.

4) Change the public API of _BaseScorer to make scoring of externally
computed predictions a public operation along with the existing
functionality. Also possibly rename _BaseScorer to BaseScorer.

5) Change the public API of _BaseScorer so that it only handles externally
computed predictions. The existing functionality would be implemented by
the caller (as a callback, since the required type of prediction data is
not known by the caller).
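
For comparison, options 2 and 4 from the list above could look roughly like
the following sketch. Every name here (ScorerSketch, score_predictions,
PrecomputedPredictor) is hypothetical and chosen for illustration; none of
them is an existing or agreed-upon scikit-learn API.

```python
# Hypothetical sketches of options 2 and 4; none of these names are part of
# scikit-learn's API.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


class ScorerSketch:
    """Simplified stand-in for a predict-based scorer."""

    def __init__(self, score_func):
        self._score_func = score_func

    def __call__(self, estimator, X, y_true):
        # Existing behaviour: compute predictions internally, then score.
        return self.score_predictions(y_true, estimator.predict(X))

    def score_predictions(self, y_true, y_pred):
        # Option 4: a public entry point for externally computed predictions.
        return self._score_func(y_true, y_pred)


class PrecomputedPredictor:
    """Option 2: dummy predictor that ignores X and replays stored output."""

    def __init__(self, y_pred):
        self._y_pred = y_pred

    def predict(self, X):
        # X is deliberately unused; the predictions were computed elsewhere.
        return self._y_pred


y_true = [0, 1, 1, 0]
oob_pred = [0, 1, 0, 0]  # pretend these came from an out-of-bag pass
scorer = ScorerSketch(accuracy)

score_option_2 = scorer(PrecomputedPredictor(oob_pred), None, y_true)
score_option_4 = scorer.score_predictions(y_true, oob_pred)
```

Both calls give the same score; the difference is that option 2 has to pass
a placeholder for X, while option 4 states directly what is being scored.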

So far in the PR we've been looking at options 2, 3, and 4, with 4 seeming
like a good candidate. Once we decide on one of these options, I'd like to
follow up with stakeholders on the specifics of what the new interface will
look like.

Thanks,

Aaron Staple
