the penultimate layer of the network (so not the layer that says which class the image falls into, but the one before that).
Yeah, it is sort of the same thing, though. The last layer, softmax, just normalizes the scores so that they can be treated as probabilities. So the highest score in the penultimate layer gives you the class too. The authors say that the normalization screws up their approach, so that is why they are working with the penultimate layer.
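To make that concrete: softmax is monotone, so it never changes which class wins. A tiny sketch with made-up scores (not from the paper):

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability; softmax is shift-invariant.
    e = np.exp(z - z.max())
    return e / e.sum()

# Made-up penultimate-layer scores for one image.
scores = np.array([2.0, -1.0, 0.5])
probs = softmax(scores)

# Softmax only rescales: the highest penultimate score already picks
# the same class as the highest probability.
assert scores.argmax() == probs.argmax()
```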
For each class j, you take all the images that were correctly classified as belonging to that class, and you take all of their activation vectors and call that collection of vectors "S".
Yes, up to this point it is clear.
Then you compute a mean for each "weight" in S. So for the 1st weight, you take all of the values this weight takes on across the correctly classified images and compute a mean, and so on for each weight. That's your mean "Activation Vector."
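In NumPy terms, that coordinatewise ("per-weight") mean looks like this (made-up numbers, just to show the axis):

```python
import numpy as np

# Hypothetical S: activation vectors of the images correctly classified
# as class j, one row per image, one column per "weight" (coordinate).
S = np.array([
    [2.0, 0.5, 0.1],
    [1.8, 0.7, 0.2],
    [2.2, 0.3, 0.0],
])

# mu_j, the mean activation vector: average each coordinate over all
# correctly classified images (i.e., a column-wise mean).
mu_j = S.mean(axis=0)  # -> [2.0, 0.5, 0.1]
```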
OK, that is what I thought too -- mu_j is the mean activation vector for class j -- and that is what I wrote in my post (my understanding of the general idea). But looking at the notation in the algorithm description, I am not sure it is coordinatewise mean. (I think coordinatewise is what you mean by "mean for each weight".) I find the notation very confusing.
Anyway, it is good to know that you think it is coordinatewise mean too.
So for each weight it's the absolute value of the difference between the "furthest outlying" value and the mean value. That's what S_hat means in this case, I think: the most "extreme" value of all those available for that weight, and we are finding the difference between that and the mean.
I don't think this is right, but I may be missing something. That FitHigh function should return the three parameters of a Weibull distribution. So I thought it looked at the distribution of the norms/distances from the activations of the correctly classified inputs to the mean for each class, and returned the parameters that fit that distribution.
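A sketch of that reading, with made-up activations and SciPy's `weibull_min.fit` as a rough stand-in for libMR's FitHigh (which fits only the largest distances, i.e. the tail):

```python
import numpy as np
from scipy.stats import weibull_min

rng = np.random.default_rng(0)

# Hypothetical data: activation vectors of the correctly classified
# images for one class, and their coordinatewise mean mu_j.
S = rng.normal(loc=2.0, scale=0.3, size=(200, 10))
mu_j = S.mean(axis=0)

# Distance from each correct activation vector to the class mean.
dists = np.linalg.norm(S - mu_j, axis=1)

# Fit a Weibull to only the LARGEST distances: how far can an input
# stray from the mean and still be classified correctly? The tail size
# is a tunable parameter; 20 here is an arbitrary choice.
tail = np.sort(dists)[-20:]

# Three Weibull parameters: shape, location, scale.
shape, loc, scale = weibull_min.fit(tail)
```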
no subject
Date: 2018-05-10 02:44 am (UTC)
Does this help any?
Sure it does. Thanks a lot for taking the time.