Words & Pictures
Tamara Berg, Associate Professor, University of North Carolina Chapel Hill
Much of everyday language and discourse concerns the visual world around us, which makes understanding the relationship between the physical world and the language that describes it an important challenge for AI. Comprehending the complex and subtle interplay between the visual and linguistic domains will have broad applicability toward inferring human-like understanding of images, producing natural human-robot interactions, and grounding natural language. In computer vision, along with improvements in deep-learning-based visual recognition, there has been an explosion of recent interest in methods to automatically generate natural language outputs for images and videos. In this talk, I will describe our group's efforts to understand and produce relevant natural language about images, from developing early methods to generate complete and human-like image descriptions, to moving beyond general image descriptions toward more focused natural language, such as referring expressions and question-answering.
Tamara Berg received her B.S. in Mathematics and Computer Science from the University of Wisconsin, Madison in 2001. She then completed a PhD at the University of California, Berkeley in 2007 and spent one year as a research scientist at Yahoo! Research. From 2008 to 2013, Tamara was an Assistant Professor in the computer science department at Stony Brook University and a core member of the consortium for Digital Art, Culture, and Technology (cDACT). In 2013, Tamara joined the University of North Carolina Chapel Hill as an Assistant Professor and was promoted to tenured Associate Professor in 2015. She is also a co-founder of Shopagon Inc, a start-up in the computer vision and retail space that uses artificial intelligence algorithms to personalize the online clothing shopping experience. Tamara is a recipient of the NSF CAREER Award, the 2013 Marr Prize, and the 2016 Hettleman Award. Her research straddles the boundary between Computer Vision and Natural Language Processing with applications to large-scale recognition, retrieval, fashion, and social network analysis.