Well, I hadn't noticed that the last iteration of the indexing system was missing a lot of text, and that papers were jumbled or missing. So I'm rewriting the engine that reads each paper and indexes it. I don't like the algorithm as it stands. The thesaurus, though, looks good and works well with the blocks that exist in the database, so I should be done with the index once the new algorithm is written. An added benefit is that it should also index the other languages.