Tolerating some redundancy significantly speeds up clustering of large protein databases
scientific article published in January 2002