I'm currently implementing a bag of visual words (BoVW) model as part of my lab project. The steps the algorithm uses are as follows:
Now my PI wants me to implement one more step: feature extraction. He suggested PCA, t-SNE, an auto-encoder and so on. A lot of the BoVW implementations I checked out include such a step, usually an auto-encoder, so it's a reasonable request.
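To make the question concrete, here is a minimal sketch of how I understand the extra step. It assumes PCA as the chosen method and uses only NumPy. The descriptors and the codebook are random placeholders (a real codebook would normally come from k-means on real local descriptors), and the names `bovw_histogram` and `pca_reduce` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: 20 images, each with 50 local descriptors of
# dimension 8, and a codebook of 16 visual words (random here, not learned).
n_images, n_desc, desc_dim, n_words = 20, 50, 8, 16
descriptors = rng.normal(size=(n_images, n_desc, desc_dim))
codebook = rng.normal(size=(n_words, desc_dim))


def bovw_histogram(desc, codebook):
    """Assign each descriptor to its nearest visual word and count them."""
    # Squared Euclidean distances, shape (n_desc, n_words)
    d2 = ((desc[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()  # L1-normalise so each histogram sums to 1


# Existing BoVW output: one histogram per image
histograms = np.stack([bovw_histogram(d, codebook) for d in descriptors])


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)  # centre each histogram bin
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T


# The proposed extra step: compress 16-bin histograms to 4 dimensions
reduced = pca_reduce(histograms, n_components=4)
print(histograms.shape, reduced.shape)
```

So each image goes from a 16-bin histogram to a 4-dimensional vector before it reaches the classifier.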
The thing is: why is this step even needed? By describing each photo as a histogram, haven't we already applied feature extraction of some sort? Doesn't adding this step just overcomplicate things without adding any value?
Thank you all for any insights on this.