In this post Google mentions PLSI (probabilistic latent semantic indexing) and latent Dirichlet allocation as examples of variants of LSI.
Their approach is different because, instead of treating the document as a bag of words, it uses a temporal Markov structure, so word order matters.
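To make the bag-of-words versus Markov distinction concrete, here is a toy sketch. It is not Google's method: I'm assuming the simplest possible reading of "temporal Markov structure", a first-order chain where each word depends only on the word before it. The function names and the two example sentences are made up for illustration.

```python
from collections import Counter, defaultdict


def bag_of_words(tokens):
    """Order-free view: only how often each word occurs."""
    return Counter(tokens)


def transition_counts(tokens):
    """Order-aware view (assumed first-order Markov chain):
    how often each word is followed by each next word."""
    transitions = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        transitions[current][following] += 1
    return {word: dict(nexts) for word, nexts in transitions.items()}


# Two hypothetical documents with the same words in a different order.
doc_a = "dog bites man".split()
doc_b = "man bites dog".split()

# The bag-of-words view cannot tell the two documents apart...
same_bag = bag_of_words(doc_a) == bag_of_words(doc_b)

# ...while the transition view can, because it keeps the ordering.
same_transitions = transition_counts(doc_a) == transition_counts(doc_b)

print("identical bags of words:", same_bag)
print("identical transitions:", same_transitions)
```

The two sentences mean very different things, yet a pure bag-of-words model sees them as the same document. Anything that tracks transitions between words has at least a chance of telling them apart.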
This supports my post about how LSI in its very basic form, as summarized in various places including the excellent Wikipedia, is not the variety used at Google, whatever Matt Cutts says. Yes, it is used, but he doesn't give away the important information; what he presents is a very basic version. It's like saying "Yes, we use glue in our computer chips" or "Yes, here at NASA we use glue as an adhesive for our rockets". It's unlikely to be the glue your child uses at playschool :)