2008
Benoît Sagot - WOLF
by parmentierf
WOLF (Wordnet Libre du Français, "Free French WordNet") is a free semantic lexical resource (a wordnet) for French.
2006
Official Google Research Blog: All Our N-gram are Belong to You
by parmentierf & 1 other
Here at Google Research we have been using word n-gram models for a variety of R&D projects, such as statistical machine translation, speech recognition, spelling correction, entity detection, information extraction, and others. While such models have usually been estimated from training corpora containing at most a few billion words, we have been harnessing the vast power of Google's datacenters and distributed processing infrastructure to process larger and larger training corpora. We found that there's no data like more data, and scaled up the size of our data by one order of magnitude, and then another, and then one more - resulting in a training corpus of one trillion words from public Web pages.
2004
Natural Language Toolkit
by parmentierf
The Natural Language Toolkit is a suite of Python packages and data for natural language processing; it comes with extensive API documentation and tutorials. NLTK-Lite is the version under active development.