Term Operations
Stemming
- Conflation of related words, usually reducing to a common root -- do not confuse with truncation
- e.g., psycholog (psychologist, psychological, psychology)
- Sometimes this is done automatically, on the fly, at query time
- Can also be done when the index is created (the index is consequently much smaller)
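
A minimal sketch of index-time stemming, to make the index-size point concrete. The suffix list and the names naive_stem and build_index are hypothetical and chosen only so the example above works; production stemmers (Porter, for instance) use much more careful, ordered rule sets.

```python
from collections import defaultdict

# Hypothetical suffix list, longest first. For illustration only.
SUFFIXES = ("ically", "ical", "ists", "ist", "ies", "y", "s")


def naive_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least 3 characters of stem."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs: dict[int, str], stem: bool = True) -> dict[str, set[int]]:
    """Map each term (or its stem) to the set of document ids containing it."""
    index: dict[str, set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            key = naive_stem(token) if stem else token
            index[key].add(doc_id)
    return dict(index)


docs = {1: "psychologist", 2: "psychological", 3: "psychology"}
stemmed = build_index(docs, stem=True)     # one entry covering all three documents
unstemmed = build_index(docs, stem=False)  # three separate entries
```

With stemming applied at index time, the three word forms collapse into a single index entry, which is why the stemmed index is smaller. Doing the conflation at query time instead would leave the index as in unstemmed and expand the query to cover the variant forms.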
Term Weighting
- Weights are determined for each term
- Used for relevance and ranking determinations
- Sometimes done on-the-fly
- Term weights (based on inter-document and inter-database frequencies) are computed and stored at the time of indexing (recalculation is expensive)
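
A rough sketch of computing term weights once, at indexing time. It assumes a TF-IDF style scheme (count of the term within a document, scaled by how rare the term is across the collection). The notes do not name a specific formula, so the formula and the name compute_weights are assumptions for illustration.

```python
import math
from collections import Counter


def compute_weights(docs: dict[int, str]) -> dict[int, dict[str, float]]:
    """Return {doc_id: {term: weight}} using tf * log(N / df) (assumed scheme)."""
    n_docs = len(docs)
    # Term frequency: how often each term occurs within each document.
    term_counts = {
        doc_id: Counter(text.lower().split()) for doc_id, text in docs.items()
    }
    # Document frequency: in how many documents each term occurs.
    doc_freq: Counter = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())
    # Weights are computed once here and would be stored with the index.
    return {
        doc_id: {
            term: tf * math.log(n_docs / doc_freq[term])
            for term, tf in counts.items()
        }
        for doc_id, counts in term_counts.items()
    }


weights = compute_weights({1: "cat cat dog", 2: "dog bird"})
```

Here "dog" appears in every document and so gets weight 0, while "cat" is both frequent in document 1 and rare in the collection, so it gets the highest weight there. Because the document frequencies depend on the whole collection, adding or removing documents changes every stored weight, which is why recalculation is costly.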