Robust hash functions are central to the security of multimedia content authentication systems. Such functions are sensitive to a secret key but robust to many allowed signal processing operations on the underlying content. The robustness of the hash function to changes in the original content implies the existence of a cluster in the feature space around the original content's feature vector, every point of which is hashed to the same output. The shape and size of the cluster determine the trade-off between the robustness and the security of an authentication system based on the robust hash function. The clustering itself depends on the secret key and is hence unknown to the attacker. However, we show that it may be possible to learn the specific clustering arrived at by the robust visual hash function (VHF). Given just an input and its hash bits, we show how to construct a statistical model of the hash function without any knowledge of the secret key used to compute the hash. We also show how to use this model to engineer arbitrary and malicious collisions. Finally, we propose one possible modification to VHF that makes it difficult to construct a model that mimics its behavior.