PhD Student at UMass Amherst
Oh I see, by all(sensor_preds) I meant sum(logit_i for i in range(n_sensors)) (used as a score for the probability that all sensors are activated). Makes sense, thanks!
Is the individual measurement prediction AUROC a) or b)?
a) mean(AUROC(sensor_i_pred, sensor_i))
b) AUROC(all(sensor_preds), all(sensors))
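
To make the difference between a) and b) concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the function names, the toy logits and labels, and the use of a summed logit as the aggregate score for "all sensors on". It is not the paper's evaluation code.

```python
def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_per_sensor_auroc(logits, sensors):
    """Option a): compute AUROC separately for each sensor, then average."""
    n_sensors = len(sensors[0])
    per_sensor = [
        auroc([row[i] for row in logits], [row[i] for row in sensors])
        for i in range(n_sensors)
    ]
    return sum(per_sensor) / n_sensors


def all_sensors_auroc(logits, sensors):
    """Option b): one AUROC on the aggregate "all sensors on" target.

    Assumption: the aggregate score is the sum of per-sensor logits.
    """
    agg_scores = [sum(row) for row in logits]
    agg_labels = [all(row) for row in sensors]
    return auroc(agg_scores, agg_labels)


# Hypothetical toy data: 4 examples, 2 sensors.
# logits[k][i] is the predicted logit for sensor i on example k.
logits = [[2.0, 1.0], [4.0, -0.5], [-1.0, 0.5], [-2.0, -1.0]]
sensors = [[1, 1], [1, 0], [0, 1], [0, 0]]

a = mean_per_sensor_auroc(logits, sensors)
b = all_sensors_auroc(logits, sensors)
print(f"a) mean per-sensor AUROC: {a:.3f}")
print(f"b) all-sensors AUROC:     {b:.3f}")
```

On this toy data each sensor is ranked perfectly on its own, but the second example gets a higher summed logit than the only all-sensors-on example, so b) comes out lower than a). The two definitions can disagree, which is why it matters which one was reported.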
Did the paper report the accuracy of the pure prediction model (on the pure prediction task)? I'm trying to replicate it and want a sanity check.
IMO most exciting mech-interp research since SAEs, great work.
A few thoughts / questions: