On Mon September 06, 2021

Speaker

Hanjoon Kim (FuriosaAI / CTO)


Title

Challenges of building inference chips for data centers


Abstract

According to recent papers from hyperscale data centers[1][2], demand for deep learning inference in data centers is growing rapidly. While energy efficiency is important for reducing TCO (total cost of ownership), high performance is also essential for serving large models in production. Hyperscalers have likewise emphasized the importance of programmability and flexibility, so that inference accelerators can track the rapid progress of DNNs[1]. To build a production accelerator that meets all of these challenging requirements, rather than a chip optimized for one specific model, the architecture should expose the hardware's raw parallelism and energy efficiency to software through a well-defined abstraction, and the software stack should exploit every opportunity for parallelism and energy efficiency in each operator and model. Accomplishing such cross-layer optimization across algorithm, architecture, and software requires small, excellent teams that communicate deeply and closely, together with design methodologies and infrastructure that support these communication structures.
[1] Norman P. Jouppi et al., Ten Lessons From Three Generations Shaped Google’s TPUv4i : Industrial Product, ISCA'21
[2] Michael Anderson et al., First-Generation Inference Accelerator Deployment at Facebook, https://arxiv.org/abs/2107.04140
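As a toy illustration of what "exposing parallelism to the software stack" can mean (my own sketch under simple assumptions, not FuriosaAI's architecture or compiler), the snippet below tiles a single matrix multiplication into independent output tiles that a scheduler could, in principle, dispatch to separate compute units; the tile size is an arbitrary assumption.

```python
# Illustrative sketch only: how a software stack might expose operator-level
# parallelism by tiling a matmul into independent work items. This is NOT
# FuriosaAI's design; the tile size and scheduling are arbitrary assumptions.
import numpy as np

TILE = 64  # assumed tile size; a real compiler would pick this per hardware

def tiled_matmul(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    m, k = a.shape
    k2, n = b.shape
    assert k == k2
    out = np.zeros((m, n), dtype=a.dtype)
    # Each (i, j) output tile is independent, so a scheduler could map the
    # tiles to different compute units and run them in parallel.
    for i in range(0, m, TILE):
        for j in range(0, n, TILE):
            acc = np.zeros((min(TILE, m - i), min(TILE, n - j)), dtype=a.dtype)
            for p in range(0, k, TILE):
                acc += a[i:i+TILE, p:p+TILE] @ b[p:p+TILE, j:j+TILE]
            out[i:i+TILE, j:j+TILE] = acc
    return out

if __name__ == "__main__":
    a = np.random.rand(128, 96).astype(np.float32)
    b = np.random.rand(96, 160).astype(np.float32)
    assert np.allclose(tiled_matmul(a, b), a @ b, atol=1e-4)
```

In a real stack, the tiling, scheduling, and data movement would of course be co-designed with the hardware abstraction discussed in the abstract.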


Bio

Hanjoon Kim is co-founder and CTO of FuriosaAI Inc. He leads AI chip development, setting the engineering direction and technology vision. Prior to FuriosaAI, he led the development of a memory-centric accelerator architecture targeting hyperscale data centers at Samsung Memory. He holds a PhD in Computer Science from KAIST.


Language

Korean

On Mon September 13, 2021

Speaker

Michael Bronstein (Imperial College London / Professor, Twitter / Head of Graph ML Research)


Title

Geometric Deep Learning: from Euclid to drug design


Abstract

For nearly two millennia, the word “geometry” was synonymous with Euclidean geometry, as no other types of geometry existed. Euclid’s monopoly came to an end in the 19th century, when multiple examples of non-Euclidean geometries were constructed. However, these studies quickly diverged into disparate fields, with mathematicians debating the relations between different geometries and what defines a geometry in the first place. A way out of this pickle was shown by Felix Klein in his Erlangen Programme, which proposed approaching geometry as the study of invariants or symmetries using the language of group theory. In the 20th century, these ideas proved fundamental to the development of modern physics, culminating in the Standard Model.
The current state of deep learning somewhat resembles the situation in the field of geometry in the 19th century. On the one hand, in the past decade, deep learning has brought a revolution in data science and made possible many tasks previously thought to be beyond reach, including computer vision, playing Go, and protein folding. On the other hand, we have a zoo of neural network architectures for various kinds of data, but few unifying principles. As in times past, it is difficult to understand the relations between different methods, inevitably resulting in the reinvention and re-branding of the same concepts.
Geometric Deep Learning aims to bring geometric unification to deep learning in the spirit of the Erlangen Programme. Such an endeavour serves a dual purpose: it provides a common mathematical framework to study the most successful neural network architectures, such as CNNs, RNNs, GNNs, and Transformers, and gives a constructive procedure to incorporate prior knowledge into neural networks and build future architectures in a principled way.
In this talk, I will overview the mathematical principles underlying Geometric Deep Learning on grids, graphs, and manifolds, and show some of the exciting and groundbreaking applications of these methods in the domains of computer vision, social science, biology, and drug design.
(based on joint work with J. Bruna, T. Cohen, P. Veličković)
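To make the symmetry viewpoint concrete, here is a minimal sketch (my own illustration, not the speakers' code) of a message-passing layer that is equivariant to node permutations: because neighbors are aggregated with a symmetric sum, relabelling the nodes of the input graph simply relabels the output features in the same way.

```python
# Minimal sketch of a permutation-equivariant message-passing layer,
# illustrating the "symmetry as a design principle" idea from the abstract.
# Shapes and the update rule are illustrative assumptions.
import numpy as np

def message_passing_layer(x, adj, w_self, w_neigh):
    """x: (N, d) node features, adj: (N, N) adjacency matrix (0/1),
    w_self, w_neigh: (d, d_out) weight matrices."""
    # Sum aggregation over neighbors is invariant to neighbor ordering,
    # which makes the whole layer equivariant to node relabelling.
    neigh_sum = adj @ x                      # (N, d)
    h = x @ w_self + neigh_sum @ w_neigh     # (N, d_out)
    return np.maximum(h, 0.0)                # ReLU

# Equivariance check: permuting the nodes permutes the outputs identically.
rng = np.random.default_rng(0)
N, d, d_out = 5, 4, 3
x = rng.normal(size=(N, d))
adj = (rng.random((N, N)) < 0.4).astype(float)
adj = np.maximum(adj, adj.T)                 # undirected graph
w1, w2 = rng.normal(size=(d, d_out)), rng.normal(size=(d, d_out))

perm = rng.permutation(N)
P = np.eye(N)[perm]                          # permutation matrix
out = message_passing_layer(x, adj, w1, w2)
out_perm = message_passing_layer(P @ x, P @ adj @ P.T, w1, w2)
assert np.allclose(P @ out, out_perm)
```

The same reasoning, applied to translations, rotations, or other symmetry groups rather than permutations, yields convolutional and related architectures in the geometric deep learning framework.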


Bio

Michael Bronstein is a professor at Imperial College London, where he holds the Chair in Machine Learning and Pattern Recognition, and is Head of Graph Learning Research at Twitter. He also heads ML research in Project CETI, a TED Audacious Prize-winning collaboration aimed at understanding the communication of sperm whales. Michael received his PhD from the Technion in 2007. He has held visiting appointments at Stanford, MIT, and Harvard, and has also been affiliated with three Institutes for Advanced Study (at TUM as a Rudolf Diesel Fellow (2017-2019), at Harvard as a Radcliffe Fellow (2017-2018), and at Princeton as a short-term scholar (2020)). Michael is the recipient of the Royal Society Wolfson Research Merit Award, the Royal Academy of Engineering Silver Medal, five ERC grants, two Google Faculty Research Awards, and two Amazon AWS ML Research Awards. He is a Member of the Academia Europaea, a Fellow of IEEE, IAPR, BCS, and ELLIS, an ACM Distinguished Speaker, and a World Economic Forum Young Scientist. In addition to his academic career, Michael is a serial entrepreneur and founder of multiple startup companies, including Novafora, Invision (acquired by Intel in 2012), Videocites, and Fabula AI (acquired by Twitter in 2019). He has previously served as Principal Engineer at Intel Perceptual Computing and was one of the key developers of the Intel RealSense technology.


Language

English

On Mon September 27, 2021

Speaker

Andreas Geiger (University of Tübingen / Professor, MPI for Intelligent Systems / Group Leader)


Title

Neural Implicit Representations for 3D Vision


Abstract

In this lecture, I will show recent results on learning neural implicit 3D representations, departing from the traditional paradigm of representing 3D shapes explicitly using voxels, point clouds or meshes. Neural implicit representations have a small memory footprint and allow for modeling arbitrary 3D topologies at (theoretically) arbitrary resolution in continuous function space. I will discuss the ability and limitations of these approaches in the context of reconstructing 3D geometry (Occupancy Networks), appearance (Implicit Surface Light Fields) and motion (Occupancy Flow). I will further demonstrate how implicit representations can be learned using only 2D supervision through implicit differentiation of the level set constraint (Differentiable Volumetric Rendering). Finally, I will give a brief outlook on follow-up works including NeRF, GRAF, GIRAFFE and KiloNeRF.
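As a rough, untrained toy (illustrative assumptions only, not the Occupancy Networks implementation), the sketch below shows the core idea of a neural implicit representation: a small network maps any continuous 3D coordinate, conditioned on a latent shape code, to an occupancy probability, so no voxel grid or mesh has to be stored and the field can be queried at arbitrary resolution.

```python
# Toy illustration of a neural implicit occupancy function: a small MLP maps a
# 3D point (plus a latent shape code) to an occupancy probability in [0, 1].
# The surface is the 0.5 level set. Layer sizes are arbitrary assumptions.
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM, HIDDEN = 8, 32

# Randomly initialized weights; a real model would be trained on 3D supervision
# (or, as discussed in the talk, on 2D images via differentiable rendering).
W1 = rng.normal(scale=0.5, size=(3 + LATENT_DIM, HIDDEN))
W2 = rng.normal(scale=0.5, size=(HIDDEN, 1))

def occupancy(points, z):
    """points: (N, 3) query coordinates, z: (LATENT_DIM,) shape code.
    Returns (N,) occupancy probabilities."""
    inp = np.concatenate([points, np.tile(z, (points.shape[0], 1))], axis=1)
    h = np.tanh(inp @ W1)
    logits = (h @ W2).squeeze(-1)
    return 1.0 / (1.0 + np.exp(-logits))     # sigmoid

# Query the continuous field at arbitrary resolution: no voxel grid is stored.
z = rng.normal(size=LATENT_DIM)
queries = rng.uniform(-1, 1, size=(1000, 3))
occ = occupancy(queries, z)
inside = queries[occ > 0.5]                  # points the model deems occupied
print(f"{len(inside)} of {len(queries)} sampled points classified as inside")
```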


Bio

Andreas Geiger is a full professor at the University of Tübingen. Prior to this, he was a visiting professor at ETH Zürich and a group leader at the Max Planck Institute for Intelligent Systems. He studied at KIT, EPFL and MIT, and received his PhD degree in 2013 from the Karlsruhe Institute of Technology (KIT). His research interests are at the intersection of computer vision, machine learning and robotics, with a particular focus on 3D scene perception, deep representation learning, generative models and sensorimotor control in the context of autonomous systems. In 2012, he published the KITTI vision benchmark suite, which has become one of the most influential testbeds for evaluating stereo, optical flow, scene flow, detection, tracking, motion estimation and segmentation algorithms. His work has been recognized with several prizes, including the IEEE PAMI Young Investigator Award, the Heinz Maier-Leibnitz Prize of the German Science Foundation and the German Pattern Recognition Award. In 2013 and 2021 he received the CVPR best paper and best paper runner-up awards. He also received the best paper award at GCPR 2015 and 3DV 2015 as well as the best student paper award at 3DV 2017. In 2019, he was awarded a Starting Grant by the European Research Council. He is a board member of the ELLIS initiative and associate faculty of the International Max Planck Research School (IMPRS) for Intelligent Systems. He coordinates the ELLIS PhD and PostDoc program. He regularly serves as area chair and associate editor for several computer vision conferences and journals, including CVPR, ICCV, ECCV, PAMI and IJCV.


Language

English

On Fri October 08, 2021

Speaker

Shamsi Iqbal (Microsoft / Principal Researcher)


Title

Towards Individual Empowerment through Productivity and Wellbeing Balance


Abstract

As our work environments and work practices rapidly evolve in response to the changing landscape of work, what we envision as the future of work is being fundamentally challenged. Research on productivity and multitasking has to adapt to this changing world and anticipate what the future may look like, in particular by taking into account the growing need to balance work and life. In this session I will talk about redefining productivity for the new future of work, in which many organizations are switching to hybrid arrangements. In hybrid scenarios, the need to get things done while on the go or with divided attention will continue to dominate, and taking care of wellbeing will rapidly become a core necessity for sustained productivity. I will discuss strategies for how thoughtful design of tools and products can promote wellbeing, even when the core functionality of the tool is not about wellbeing, and what organizations should be doing to adapt to a successful future of hybrid work.


Bio

Dr. Shamsi T. Iqbal is a Principal Researcher in the Productivity and Intelligence (P+I) group at Microsoft Research, Redmond. Her primary expertise is in the domain of attention management and interruptions. More recently her work has focused on redefining productivity, introducing novel ways of being productive by leveraging micromoments and balancing productivity and well-being in interaction design. Shamsi's research related to the Future of Work has recently been covered in the Economist, the Atlantic, and the Technology Review Germany. Her work on driving and distraction was featured in the New York Times and MIT Technology Review, among others, and was also covered by King 5 News (the NBC affiliate in the Seattle area). Shamsi has served on many organizing and program committees for Human-Computer Interaction conferences, is currently serving as an ACM TOCHI Associate Editor and as guest editor for the IEEE Pervasive special issue on the Future of Work, and was General Co-chair for UIST 2020. She is one of the co-authors of the document Microsoft released on the Future of Remote Work in 2021. Shamsi received her Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign in 2008 and her Bachelor's in Computer Science and Engineering from Bangladesh University of Engineering and Technology in 2001.


Language

English

On Fri October 15, 2021

Speaker

Kristen Grauman (UT Austin / Professor)


Title

First-Person Video for Understanding Interactions


Abstract

Today’s perception systems excel at naming things in third-person Internet photos or videos, which purposefully convey a visual scene or moment. In contrast, first-person or “egocentric” perception requires understanding the multi-modal video that streams to a person’s (or robot’s) wearable camera. While video from an always-on wearable camera lacks the curation of an intentional photographer, it does provide a special window into the camera wearer’s attention, goals, and interactions with people and objects in her environment. These factors make first-person video an exciting avenue for the future of perception in augmented reality and robot learning.

Motivated by this setting, I will present our recent work on first-person video. First, we explore learning visual affordances to anticipate how objects and spaces can be used. We show how to transform egocentric video into a human-centric topological map of a physical space (such as a kitchen) that captures its primary zones of interaction and the activities they support. Moving down to the object level, we develop video anticipation models that localize interaction “hotspots” indicating how/where an object can be manipulated (e.g., pressable, toggleable, etc.). Towards translating these affordances into robot action, we prime reinforcement learning agents to prefer human-like interactions, thereby accelerating their task learning. Turning to audio-visual sensing, we attempt to extract a conversation partner’s speech from competing background sounds or other human speakers. Finally, I will briefly preview a multi-institution large-scale egocentric video dataset effort.


Bio

Kristen Grauman is a Professor in the Department of Computer Science at the University of Texas at Austin and a Research Scientist in Facebook AI Research (FAIR).  Her research in computer vision and machine learning focuses on video, visual recognition, and embodied perception.  Before joining UT-Austin in 2007, she received her Ph.D. at MIT.  She is an IEEE Fellow, AAAI Fellow, Sloan Fellow, and recipient of the 2013 Computers and Thought Award.  She was inducted into the UT Academy of Distinguished Teachers in 2017.  She and her collaborators have been recognized with several Best Paper awards in computer vision, including a 2011 Marr Prize and a 2017 Helmholtz Prize (test of time award).  She served as an Associate Editor-in-Chief for the Transactions on Pattern Analysis and Machine Intelligence (PAMI) and a Program Chair of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2015 and Neural Information Processing Systems (NeurIPS) 2018.


Language

English

On Mon October 25, 2021

Speaker

Naila Murray (Facebook / Research Scientist)


Title

Unsupervised Meta-Domain Adaptation for Instance Retrieval


Abstract

Cross-domain item retrieval naturally arises in a variety of applications, for example in question answering, or when querying online visual catalogs with consumer images. We focus on the cross-domain retrieval task and illustrate it with the latter scenario, that is, when unconstrained consumer images are used to query for fashion items in a collection of high-quality photographs provided by retailers. To perform this cross-domain task, approaches typically leverage both the consumer and shop domains of a given dataset to learn a domain-invariant representation, allowing images of such different nature to be compared directly. When consumer images are not available beforehand, such training is impossible. In this talk, I describe a recent approach to this challenging yet practical scenario, which leverages representations learned for cross-domain retrieval on another source dataset and adapts them to the target dataset in this setting.
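For readers unfamiliar with the setup, the following minimal sketch (my own illustration, with a random projection standing in for a trained, domain-invariant encoder; not the speaker's method) shows how retrieval works once consumer and shop images live in a shared embedding space: both sides are embedded, L2-normalized, and ranked by cosine similarity.

```python
# Illustrative sketch of cross-domain retrieval with a shared embedding space.
# The "encoder" here is a random projection standing in for a trained,
# domain-invariant model; features and catalog are synthetic.
import numpy as np

rng = np.random.default_rng(0)
FEAT_DIM, EMB_DIM = 512, 128
proj = rng.normal(size=(FEAT_DIM, EMB_DIM))   # stand-in for a trained encoder

def embed(features):
    """Map raw image features (N, FEAT_DIM) to unit-norm embeddings (N, EMB_DIM)."""
    e = features @ proj
    return e / np.linalg.norm(e, axis=1, keepdims=True)

# Fake features for a shop catalog and one consumer-photo query.
shop_feats = rng.normal(size=(1000, FEAT_DIM))
query_feat = rng.normal(size=(1, FEAT_DIM))

shop_emb = embed(shop_feats)
query_emb = embed(query_feat)

# On unit-norm vectors, cosine similarity reduces to a dot product.
scores = shop_emb @ query_emb.T               # (1000, 1)
top5 = np.argsort(-scores[:, 0])[:5]          # indices of the 5 best matches
print("top-5 catalog items:", top5)
```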


Bio

Naila Murray obtained a BSE in electrical engineering from Princeton University in 2007. In 2012, she received her Ph.D. from the Universitat Autonoma de Barcelona, in affiliation with the Computer Vision Center. She joined Xerox Research Centre Europe in 2013 as a research scientist in the computer vision team, working on topics including fine-grained visual categorization, image retrieval and visual attention. From 2015 to 2019 she led the computer vision team at Xerox Research Centre Europe, and continued to serve in this role after its acquisition and transition to becoming NAVER LABS Europe. In 2019, she became the director of science at NAVER LABS Europe. In 2020, she joined Facebook AI Research as a research engineering manager. She has served as area chair for ICLR 2018, ICCV 2019, ICLR 2019, CVPR 2020, ECCV 2020, and program chair for ICLR 2021. Her current research interests include continual learning and multi-modal search.


Language

English

On Mon November 01, 2021

Speaker

Kristie Lu Stout (CNN / Anchor)


Title

Rethinking the Future


Abstract

The CNN anchor and correspondent first visited KAIST back in 2009 for the network’s week-long “Eye on South Korea” special to showcase the university's advances in electric transport and to meet Hubo, the KAIST humanoid robot, live on air. Over a decade on, Kristie Lu Stout returns to KAIST for a special interactive Webinar to reflect on her reporting and how technology offers hope in a damaged world.


Bio

Kristie Lu Stout is an award-winning anchor and correspondent for CNN, based in Hong Kong. She reports from the newsroom and in the field on major breaking news stories including US-China relations, Hong Kong's changing political landscape, the coronavirus pandemic, and the aftermath of extreme climate events in the region. She also hosts feature programs for CNN, most recently presenting Tech For Good, a television and digital series of intimate, inspiring, and transformative stories from every corner of the globe. In her feature show Inventing Tomorrow, Lu Stout interviews the entrepreneurs, experts and businesses developing innovative ways to battle and navigate the COVID-19 health crisis. Based in China for two decades, Lu Stout maintains a focus on how developments in China are dramatically changing the world for all of us. From anchoring CNN's groundbreaking "Eye on China" series in 2004 to covering the Trump-Xi Summit in Beijing to reporting on fresh tensions with a new U.S. administration, Lu Stout has remained committed to reporting on the country. She was also instrumental in launching "On China," CNN's first-ever regular series focused on the country and a first by any international TV news network. Her program "Inventing Tomorrow: Tech in a Time of Pandemic" was awarded Best Covid-19 Factual Feature at the 2020 Content Asia Awards. In 2020, she and her CNN Hong Kong colleagues were awarded Best Continuing News Reporting for TV and Video by the Association for International Broadcasting for team coverage of the 2019 Hong Kong Protests. In 2018, Lu Stout was awarded Best News or Current Affairs Presenter at the Asian Academy Creative Awards, while the news program she conceived and launched in 2010, News Stream, was awarded Best News Program. Other accolades include multiple honors from the Asia Television Awards and Best News Coverage from the Royal Television Society in 2013 for coverage of Super Typhoon Haiyan in the Philippines. With over 850,000 fans and followers on social media, Lu Stout believes in engaging with her viewers and makes social media an integral part of her reporting. She also believes in contributing to the debate with viewers off-camera through regular speaking and moderating appearances, and is well known for her style in stimulating frank and lively conversations. Lu Stout also plays an active role in promoting the CNN Freedom Project, the network's award-winning initiative focused on reporting stories of modern-day slavery, including student outreach in Hong Kong and across Asia as part of #MyFreedomDay. Lu Stout started her career in journalism in San Francisco at WIRED magazine's online division. She has written on technology for various media publications including the South China Morning Post, where she founded and wrote the Beijing Byte column. Before her career in journalism, she was an early employee at Beijing-based Internet company Sohu.com and worked for Reuters' new media team in China. Lu Stout is a proud Asian American. She holds a bachelor's and a master's degree from Stanford University, and studied advanced Mandarin Chinese at Beijing's Tsinghua University.


Language

English

On Mon November 08, 2021

Speaker

Sumin Lee (Tomocube / Chief Scientist)


Title

How did a life scientist, Dr. Lee, end up as a project manager at a 3D holography company?


Abstract

Quantitative Analysis of Cell Volume, Dry Mass, and Organelle Dynamics in Single Cells Using Three-Dimensional Quantitative Phase Imaging
Quantitative Phase Imaging (QPI) provides label-free, real-time, three-dimensional (3D) imaging capability. By using laser interferometry to measure the 3D refractive index (RI) distribution, 3D images of live cells can be obtained at high spatial resolution without any molecular labeling. Furthermore, QPI images can be analyzed to provide quantitative information about a single cell, including cell volume, dry mass, and protein concentration. In this seminar, we will discuss studies in quantitative cell biology that image live cells with unprecedented correlative and quantitative bioimaging capabilities. We will also introduce representative studies that integrate artificial intelligence (AI) technologies to elucidate the characteristics of individual mammalian cells based on holotomography (HT) images and the quantitative information derived from them.
[1] Park et al., Quantitative phase imaging in biomedicine, Nature Photonics 12, 578–589 (2018)
[2] Lee et al., Deep-learning based three-dimensional label-free tracking and analysis of immunological synapses of CAR-T cells, eLife 9:e49023 (2020)
[3] Choi et al., Label-free three-dimensional analyses of live cells with deep-learning-based segmentation exploiting refractive index distributions, BioRxiv 2021.05.23.445351 (2021)
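As a concrete illustration of the quantities mentioned in the abstract, the sketch below computes cell volume and dry mass from a reconstructed 3D refractive index map using the standard linear relation between RI contrast and dry mass density; the medium RI, segmentation threshold, voxel pitch, and the refractive index increment (about 0.19 mL/g) are illustrative assumptions, not Tomocube's pipeline.

```python
# Minimal sketch (illustrative assumptions only) of computing cell volume and
# dry mass from a reconstructed 3D refractive index (RI) map. Dry mass density
# follows the linear relation delta_n = alpha * density, with an RI increment
# alpha of about 0.19 mL/g, which is numerically 0.19 um^3/pg.
import numpy as np

ALPHA = 0.19          # assumed RI increment in um^3/pg (equivalent to 0.19 mL/g)
N_MEDIUM = 1.337      # assumed RI of the surrounding medium
VOXEL_UM3 = 0.1**3    # voxel volume in um^3, assuming a 100 nm voxel pitch
THRESHOLD = 0.005     # assumed RI offset above medium used to segment the cell

def volume_and_dry_mass(ri_map: np.ndarray):
    """ri_map: 3D array of reconstructed refractive indices.
    Returns (cell volume in um^3, dry mass in pg)."""
    delta_n = ri_map - N_MEDIUM
    cell_mask = delta_n > THRESHOLD                   # crude segmentation
    volume_um3 = cell_mask.sum() * VOXEL_UM3
    # Dry mass density (pg/um^3) is delta_n / ALPHA; integrate over the cell.
    dry_mass_pg = (delta_n[cell_mask] / ALPHA).sum() * VOXEL_UM3
    return volume_um3, dry_mass_pg

# Synthetic example: a spherical "cell" with slightly elevated RI.
zz, yy, xx = np.mgrid[:128, :128, :128]
sphere = ((xx - 64)**2 + (yy - 64)**2 + (zz - 64)**2) < 40**2
ri = np.full((128, 128, 128), N_MEDIUM) + sphere * 0.02
print(volume_and_dry_mass(ri))
```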


Bio

Dr. Sumin Lee is the chief scientist and a team leader at Tomocube. Her team is running numerous research projects exploring biological applications of 3D quantitative phase imaging. Her main research interest is the quantitative analysis of individual cells and their organelle dynamics. She has also been managing product development projects, including research instruments, in vitro diagnostic medical devices, and 3D image analysis software. Dr. Lee received her Ph.D. from POSTECH in 2013, majoring in Cellular Systems Biology. Through her doctoral and post-doctoral training, she performed comparative studies of protein trafficking machineries to elucidate the evolution of organellar protein delivery mechanisms in eukaryotic cells. From 2014 to 2017, she was a senior scientist in the Division of Research Strategy Planning and the Division of Forensic Genetics, National Forensic Service, Ministry of the Interior and Safety, Republic of Korea.


Language

Korean

On Fri November 19, 2021

Speaker

Kyunghyun Cho (NYU / Associate Professor, Genentech / Senior Director of Frontier Research)


Title

Rissanen Data Analysis - Machine Learning as Compression and Compression for Data Analysis


Abstract



Bio



Language

English

On Mon November 29, 2021

Speaker

Hwajung Hong (KAIST Dept. of Industrial Design / Associate Professor)


Title

UX Challenges for AI/ML products


Abstract

We already interact daily with instances of AI/ML. Design challenges come with integrating AI/ML solutions into day-to-day products and services. These challenges range from typical UX problems, such as explainability and user feedback, to broader social impacts and ethical implications, such as echo chambers and bias. In this talk, I will introduce ways to handle AI/ML as a new design material, discuss the complexities of designing AI/ML interactions, and propose a human-centered design framework for positive user experiences in AI/ML products.


Bio

Hwajung Hong is an Associate Professor in the Department of Industrial Design at KAIST. She explores design from, with, and by data to develop technologies that can amplify human value. She works to provide novel design methods for creating human-centered AI applications that respect various stakeholders' values, including health, well-being, safety, diversity, and equity. Before joining KAIST in 2021, she worked as an assistant professor at Seoul National University (2018-2021) and UNIST (2015-2018). Hwajung received her Ph.D. in Human-Centered Computing from Georgia Tech in 2015 and her B.S. in Industrial Design from KAIST in 2009.


Language

English

On Fri December 10, 2021

Speaker

Hao Li (Pinscreen / CEO, UC Berkeley / Distinguished Fellow)


Title

AI Synthesis: From Avatars to 3D Scenes


Abstract

In this talk I will motivate how digital humans will impact the future of communication, human-machine interaction, and content creation. I will present Pinscreen's latest technology for digitizing a 3D avatar from a single photo, and give a live demonstration. I will also showcase how we use hybrid CG and neural rendering solutions for real-time applications in next-generation virtual assistant and virtual production pipelines. I will then present a real-time teleportation system that uses only a single webcam as input, as well as our latest efforts at UC Berkeley on real-time AI synthesis of entire scenes using NeRF representations.
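As background for the NeRF portion of the talk, here is a generic sketch of the volume rendering quadrature that NeRF-style methods use to turn per-sample densities and colors along a camera ray into a pixel color (a textbook formulation with made-up inputs, not Pinscreen's or the speaker's implementation).

```python
# Generic sketch of the volume-rendering quadrature used by NeRF-style methods:
# the color of a ray is a transmittance-weighted sum of the colors predicted
# at samples along the ray. Inputs here are random stand-ins for the outputs
# of a trained neural field.
import numpy as np

def render_ray(sigmas, colors, deltas):
    """sigmas: (S,) volume densities at samples along the ray,
    colors: (S, 3) RGB values predicted at those samples,
    deltas: (S,) distances between consecutive samples.
    Returns the composited (3,) ray color."""
    alphas = 1.0 - np.exp(-sigmas * deltas)               # per-sample opacity
    # Transmittance: probability the ray reaches sample i without stopping earlier.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = trans * alphas                               # (S,)
    return (weights[:, None] * colors).sum(axis=0)         # weighted color sum

# Toy usage with random samples.
rng = np.random.default_rng(0)
S = 64
sigmas = rng.uniform(0.0, 5.0, size=S)
colors = rng.uniform(0.0, 1.0, size=(S, 3))
deltas = np.full(S, 0.02)
print("rendered ray color:", render_ray(sigmas, colors, deltas))
```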


Bio

Hao Li is CEO and Co-Founder of Pinscreen, a startup that builds cutting-edge AI-driven virtual avatar technologies. He is also a Distinguished Fellow of the Computer Vision Group at UC Berkeley. Before that, he was an Associate Professor of Computer Science at the University of Southern California, as well as the director of the Vision and Graphics Lab at the USC Institute for Creative Technologies. Hao's work in Computer Graphics and Computer Vision focuses on digitizing humans and capturing their performances for immersive communication, telepresence in virtual worlds, and entertainment. His research involves the development of novel deep learning, data-driven, and geometry processing algorithms. He is known for his seminal work in avatar creation, facial animation, hair digitization, and dynamic shape processing, as well as his recent efforts in preventing the spread of malicious deepfakes. He was previously a visiting professor at Weta Digital, a research lead at Industrial Light & Magic / Lucasfilm, and a postdoctoral fellow at Columbia and Princeton Universities. He was named one of MIT Technology Review's 35 innovators under 35 in 2013, and has been awarded the Google Faculty Research Award, the Okawa Foundation Research Grant, and the Andrew and Erna Viterbi Early Career Chair. He won the Office of Naval Research (ONR) Young Investigator Award in 2018 and was named to the DARPA ISAT Study Group in 2019. In 2020, he won the ACM SIGGRAPH Real-Time Live! “Best in Show” award. Hao obtained his PhD at ETH Zurich and his MSc at the University of Karlsruhe (TH).


Language

English