The problem with Bayes is that one must try all probability distributions with all possible parameters. I still think there should be a way to avoid this. What about this one?
What Bayes is actually doing is comparing how low is the cost in the actual configuration, compared with how low it could have been. We might do it by finding the ratio of the cost in the actual configuration, over the average cost for all configurations. This average cost might be computed along each neuron, or at a random sample in the whole space.