Hal Daumé III

Professor

4134 Iribe Center

(301) 405-1073

Academic Web Page

Research Group(s):

Computational Linguistics and Information Processing Lab

Human-Computer Interaction Lab

Institute for Trustworthy AI in Law & Society

Education:

University of Southern California (Computer Science)

Biography:

Hal Daumé III is a professor of computer science with an appointment in the University of Maryland Institute for Advanced Computer Studies. He is director of the Institute for Trustworthy AI in Law & Society (TRAILS).

Daumé’s research focuses on understanding the computational properties of learning and language, as well as trustworthy AI. He studies how machines can become more adept at human language (and at AI tasks more broadly) by developing models and algorithms that allow them to learn from data.

Go here to view Daumé's academic publications on Google Scholar.

Publications

2011

Teo CL, Yang Y, Daumé H, Fermüller C, Aloimonos Y. 2011. A Corpus-Guided Framework for Robotic Visual Perception. Workshops at the Twenty-Fifth AAAI Conference on Artificial Intelligence.

Pujara J, Daumé H, Getoor L. 2011. Using classifier cascades for scalable e-mail classification. Collaboration, Electronic Messaging, Anti-Abuse and Spam Conference, ACM International Conference Proceedings Series.

2009

Goyal A, Daumé H, Venkatasubramanian S. 2009. Streaming for large scale NLP: Language modeling. Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics. :512-520.

Agarwal A, Daumé H. 2009. Exponential family hybrid semi-supervised learning. Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI-09). :974-979.

Rai P, Daumé H. 2009. The infinite hierarchical factor regression model. arXiv preprint arXiv:0908.0570.

Daumé H. 2009. Fast search for Dirichlet process mixture models. arXiv preprint arXiv:0907.1812.

Rai P, Daumé H. 2009. Multi-label prediction via sparse infinite CCA. Advances in Neural Information Processing Systems. 22:1518-1526.

Daumé H. 2009. Bayesian multitask learning with latent hierarchies. Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence. :135-142.

Daumé H. 2009. Semi-supervised or semi-unsupervised? Proceedings of the NAACL HLT Workshop on Semisupervised Learning for Natural Language Processing. :84-85.

Daumé H. 2009. Unsupervised search-based structured prediction. Proceedings of the 26th Annual International Conference on Machine Learning. :209-216.

Daumé H. 2009. Non-parametric Bayesian areal linguistics. Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics. :593-601.

Rai P, Daumé H, Venkatasubramanian S. 2009. Streamed learning: one-pass SVMs. Proceedings of the 21st International Joint Conference on Artificial Intelligence. :1211-1216.

Daumé H. 2009. Markov random topic fields. Proceedings of the ACL-IJCNLP 2009 Conference Short Papers. :293-296.

2008

Hermjakob U, Knight K, Daumé H. 2008. Name translation in statistical machine translation: Learning when to transliterate. Proceedings of ACL-08: HLT. :389-397.

Daumé H. 2008. Cross-task knowledge-constrained self training. Proceedings of the Conference on Empirical Methods in Natural Language Processing. :680-688.

Liu P, Shi Q, Daumé H, Voth GA. 2008. A Bayesian statistics approach to multiscale coarse graining. The Journal of Chemical Physics. 129:214114-214114.

2007

Daumé H, Campbell L. 2007. A Bayesian model for discovering typological implications. Annual Meeting of the Association for Computational Linguistics. 45:65-65.

Daumé H. 2007. Frustratingly easy domain adaptation. Annual Meeting of the Association for Computational Linguistics. 45:256-256.

2006

Daumé H, Marcu D. 2006. Domain adaptation for statistical classifiers. Journal of Artificial Intelligence Research. 26(1):101-126.

Daumé H, Marcu D. 2006. Bayesian query-focused summarization. Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics. :305-312.

Daumé H, Marcu D. 2006. A Bayesian model for supervised clustering with the Dirichlet process prior. Journal of Machine Learning Research. 6(2):1551-1551.

2005

Daumé H, Marcu D. 2005. Bayesian summarization at DUC and a suggestion for extrinsic evaluation. Document Understanding Conference.

Daumé H, Marcu D. 2005. Learning as search optimization: approximate large margin methods for structured prediction. Proceedings of the 22nd international conference on Machine learning. :169-176.

Daumé H, Marcu D. 2005. A large-scale exploration of effective global features for a joint entity detection and tracking model. Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing. :97-104.

Daumé H, Marcu D. 2005. Induction of Word and Phrase Alignments for Automatic Document Summarization. Computational Linguistics. 31:505-530.

Daumé H, Langford J, Marcu D. 2005. Search-based structured prediction as classification. NIPS Workshop on Advances in Structured Learning for Text and Speech Processing, Whistler, Canada.

2004

Daumé H, Marcu D. 2004. A tree-position kernel for document compression. Proceedings of the Fourth Document Understanding Conference (DUC 2004). :6-7.

Daumé H, Brill E. 2004. Web search intent induction via automatic query reformulation. Proceedings of HLT-NAACL 2004: Short Papers. :49-52.

Daumé H, Marcu D. 2004. Supervised clustering with the Dirichlet process. NIPS'04 Learning With Structured Outputs Workshop.

Daumé H, Marcu D. 2004. A phrase-based HMM approach to document/abstract alignment. Proceedings of EMNLP. :119-126.

Daumé H, Marcu D. 2004. NP bracketing by maximum entropy tagging and SVM reranking. Proceedings of EMNLP. 4

Daumé H, Marcu D. 2004. Generic sentence fusion is an ill-defined summarization task. Proceedings of the Text Summarization Branches Out Workshop at ACL. 4:96-103.

2002

Daumé H, Marcu D. 2002. A noisy-channel model for document compression. Proceedings of the 40th Annual Meeting on Association for Computational Linguistics. :449-456.

Daumé H, Echihabi A, Marcu D, Munteanu D, Soricut R. 2002. GLEANS: A generator of logical extracts and abstracts for nice summaries. Workshop on Automatic Summarization. :9-14.

Daumé H, Knight K, Langkilde-Geary I, Marcu D, Yamada K. 2002. The importance of lexicalized syntax models for natural language generation tasks. Proc. of INLG. :9-16.

2001

Nyberg E, Daumé H. 2001. Integrated information management: an interactive, extensible architecture for information retrieval. Proceedings of the first international conference on Human language technology research. :1-6.