Yi Su, PhD
Introduction

The Random Forest Language Model Toolkit is a C++ software package based on the SRI LM Toolkit. We follow the design philosophy and coding conventions of the SRI LM Toolkit and reuse its low-level data structure classes, as well as some of its higher-level procedural code, to "stand on the shoulders of giants". A Random Forest Language Model is a collection of randomized decision tree language models with a proven record of good performance (see References).

Download

rflm-0.9.4.tar.gz (for SRI LM Toolkit <=1.5.8)

Terms of Use

The toolkit is subject to the SRILM Community Research License Version 1.0 (the "License"). A copy of the License is included in the package. In short, you are free to use the toolkit for non-commercial purposes as long as you share your modifications with the community.

References

Peng Xu and Frederick Jelinek. Random forests in language modeling. In Proceedings of EMNLP 2004, pages 325-332, Barcelona, Spain, 2004. Association for Computational Linguistics. [pdf]

Peng Xu. Random forests and the data sparseness problem in language modeling. PhD thesis, Johns Hopkins University, 2005. [ps]

Yi Su, Frederick Jelinek, and Sanjeev Khudanpur. Large-scale random forest language models for speech recognition. In Proceedings of INTERSPEECH-2007, volume 1, pages 598-601, Antwerp, Belgium, 2007. [pdf]

Yi Su. Knowledge integration into language models: a random forest approach. PhD thesis, Johns Hopkins University, 2009. [hyperref] [regular]