The goal of my research is to provide a theoretical and algorithmic framework for information science that leads to efficient strategies for assessing, gathering, extracting, and exploiting information. In this era of data deluge, we want to fully exploit the volume and richness of available data sets to efficiently infer the real-world phenomena behind the data. Information-theoretic concepts and tools are useful in data science, especially for establishing fundamental limits and exploring trade-offs in extracting information from data sets. To address new challenges originating from practical concerns in engineering information processors for big data, we also need techniques and concepts beyond the classical information-theoretic solutions.

My research focuses on developing a theoretical framework for data science that copes with practical concerns such as timeliness in decision making, efficient use of limited sensing resources, and computational efficiency in data processing. More specifically, I study questions such as: How can we design sensing strategies that acquire the observations most relevant to estimating an unknown target variable at the lowest cost? How can we quantify the value of information and develop strategies to extract the most valuable information given limited sensing resources? How can we design efficient procedures that recover information from large amounts of noisy observations? How can we design distributed querying over crowds of workers with unknown reliabilities to efficiently collect useful observations? I develop algorithms for these data-acquisition and information-recovery problems and provide performance guarantees for them using tools from probability theory, information theory, and stochastic analysis.
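To make the crowdsourced-querying question concrete, here is a minimal toy sketch (not an algorithm from this research itself, and with hypothetical worker reliabilities): when each worker's reliability is known, the maximum-likelihood way to aggregate binary answers is a log-likelihood-ratio weighted vote, which typically outperforms plain majority voting. Estimating those reliabilities when they are unknown is precisely where the research questions above begin.

```python
import math
import random

def majority_vote(answers):
    """Plain majority over +1/-1 answers."""
    return 1 if sum(answers) >= 0 else -1

def weighted_vote(answers, reliabilities):
    """Maximum-likelihood aggregation of +1/-1 answers when worker i's
    reliability p_i (probability of answering correctly) is known:
    weight each answer by log(p_i / (1 - p_i))."""
    score = sum(math.log(p / (1 - p)) * a
                for a, p in zip(answers, reliabilities))
    return 1 if score >= 0 else -1

# Simulate a crowd: worker i reports the true label with probability p_i.
random.seed(0)
truth = 1
reliabilities = [0.85, 0.7, 0.6, 0.9, 0.55]  # hypothetical worker qualities
trials = 2000
maj_correct = wtd_correct = 0
for _ in range(trials):
    answers = [truth if random.random() < p else -truth
               for p in reliabilities]
    maj_correct += majority_vote(answers) == truth
    wtd_correct += weighted_vote(answers, reliabilities) == truth
print("majority:", maj_correct / trials, "weighted:", wtd_correct / trials)
```

In this simulation the weighted vote defers more to the reliable workers and so recovers the true label more often than the unweighted majority; the design choice of log-likelihood-ratio weights follows directly from writing out the likelihood of the answer vector under each candidate label.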