Yi Su, PhD




Email
suy AT jhu DOT edu

Address
322 CSEB
Johns Hopkins Univ.
3400 N. Charles St.
Baltimore, MD 21218



Introduction

The Random Forest Language Model Toolkit is a C++ software package built on the SRI LM Toolkit. We follow the design philosophy and coding conventions of the SRI LM Toolkit, and we reuse its low-level data structure classes as well as some of its higher-level procedural code, to "stand on the shoulders of giants".

A Random Forest Language Model is a collection of randomized decision tree language models. The papers listed under References report its performance, including on large-scale speech recognition.
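
To make "a collection of decision tree language models" concrete, here is a minimal sketch of how a forest can combine its trees, assuming the combination is a plain average of the per-tree probabilities as described in Xu and Jelinek (2004). The names TreeLM and forestProb are hypothetical and chosen for illustration only; they are not part of this toolkit or of the SRI LM Toolkit API.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for one decision tree language model:
// given a word history and a candidate next word, return P(word | history).
using TreeLM = std::function<double(const std::vector<std::string>& history,
                                    const std::string& word)>;

// Forest probability as the arithmetic mean of the per-tree probabilities.
// Returns 0.0 for an empty forest (an assumption made for this sketch).
double forestProb(const std::vector<TreeLM>& trees,
                  const std::vector<std::string>& history,
                  const std::string& word) {
    if (trees.empty()) {
        return 0.0;
    }
    double sum = 0.0;
    for (const auto& tree : trees) {
        sum += tree(history, word);  // each tree scores the same event
    }
    return sum / static_cast<double>(trees.size());
}
```

Because each tree is grown with some randomization, the trees partition histories differently, and averaging lets one tree's estimate cover an event that another tree models poorly.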

Download

rflm-0.9.5.tar.gz

rflm-0.9.4.tar.gz (for SRI LM Toolkit <=1.5.8)

Terms of Use

The toolkit is subject to the SRILM Community Research License Version 1.0 (the "License"). A copy of the License is included in the package. Basically it says that you are free to use it for non-commercial purposes as long as you share your modifications with the community.

References

Peng Xu and Frederick Jelinek. Random forests in language modeling. In Proceedings of EMNLP 2004, pages 325-332, Barcelona, Spain, 2004. Association for Computational Linguistics. [ pdf ]

Peng Xu. Random forests and the data sparseness problem in language modeling. PhD thesis, Johns Hopkins University, 2005. [ ps ]

Yi Su, Frederick Jelinek, and Sanjeev Khudanpur. Large-scale random forest language models for speech recognition. In Proceedings of INTERSPEECH-2007, volume 1, pages 598-601, Antwerp, Belgium, 2007. [ pdf ]

Yi Su. Knowledge integration into language models: a random forest approach. PhD thesis, Johns Hopkins University, 2009. [ hyperref ] [ regular ]