Changes between Version 5 and Version 6 of tutorial/ProbabilisticLearningModels


Timestamp: 08/14/09 09:40:47
Author: horak

  • tutorial/ProbabilisticLearningModels

v5 v6
32 32  Feature extraction is the task of extracting features from examples.
33 33  {{{
34     E.g., in our document classification scenario, a tokenizer that extracts words from text might be used for feature extraction.
   34  E.g., in our document classification scenario, a tokenizer that extracts words from
   35  text might be used for feature extraction.
35 36  }}}
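For illustration, a minimal Python sketch of such a tokenizer-based extraction step (the tutorial prescribes no language; the function names and the bag-of-words counting are assumptions of this sketch):
{{{
import re
from collections import Counter

def tokenize(text):
    # Extract word tokens from raw text (lowercased, punctuation dropped).
    return re.findall(r"[a-z0-9']+", text.lower())

def extract_features(document):
    # Map a document to a bag-of-words feature list with occurrence counts.
    return Counter(tokenize(document))

print(extract_features("Buy cheap meds now! Cheap, cheap offers."))
# Counter({'cheap': 3, 'buy': 1, 'meds': 1, 'now': 1, 'offers': 1})
}}}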
36 37  In more sophisticated scenarios, feature extraction can be hierarchically nested by extracting new features from existing feature lists.
37 38  {{{
38     E.g., in our document classification scenario, a word n-gram algorithm extracts n-gram features from extracted word sequences.
   39  E.g., in our document classification scenario, a word n-gram algorithm extracts n-gram
   40  features from extracted word sequences.
39 41  }}}
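The nesting can be sketched the same way: the hypothetical n-gram extractor below consumes a token list produced by a previous extraction step rather than raw text.
{{{
def word_ngrams(tokens, n=2):
    # Derive n-gram features from an already-extracted token sequence.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# The input is itself the output of an earlier extraction step (tokenization),
# which is the hierarchical nesting described above.
print(word_ngrams(["buy", "cheap", "meds", "now"], n=2))
# ['buy cheap', 'cheap meds', 'meds now']
}}}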
40 42
…
44 46  First, not all features are useful for separating different classes: for such features there is no statistically significant dependency between class and feature occurrence.
45 47  {{{
46     E.g., in our document classification scenario, stop words or highly frequent words are not useful for separating, e.g., spam mails from ham mails.
   48  E.g., in our document classification scenario, stop words or highly frequent words are
   49  not useful for separating, e.g., spam mails from ham mails.
47 50  }}}
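The tutorial does not name a test for this dependency; a chi-square statistic over a 2x2 class/feature contingency table is one common choice. A Python sketch with made-up counts:
{{{
def chi_square(n11, n10, n01, n00):
    # Chi-square statistic for the dependency between class and feature:
    #   n11: class docs containing the feature,  n10: class docs without it,
    #   n01: other docs containing the feature,  n00: other docs without it.
    n = n11 + n10 + n01 + n00
    den = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    return n * (n11 * n00 - n10 * n01) ** 2 / den if den else 0.0

# A stop word occurs in nearly every mail regardless of class, so its
# occurrence is nearly independent of the class label:
print(chi_square(980, 20, 975, 25))   # low score  -> not useful
print(chi_square(400, 600, 30, 970))  # high score -> class-dependent
}}}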
48 51  Second, just a small set of features might be enough for classifying examples successfully; adding more just decreases performance.
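Given such dependency scores, the second point amounts to keeping only the top-ranked features. A hypothetical sketch (scores and cut-off are invented for illustration):
{{{
def select_top_k(feature_scores, k=100):
    # Keep only the k features whose class dependency is strongest.
    ranked = sorted(feature_scores, key=feature_scores.get, reverse=True)
    return set(ranked[:k])

scores = {"viagra": 812.4, "free": 215.0, "meeting": 190.3, "the": 0.6}
print(select_top_k(scores, k=2))  # {'viagra', 'free'}
}}}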