=====Talk Abstracts=====

====Keynote====
  * Leveling up journalism with data science, Cheryl Phillips, Stanford

Abstract: Machine learning that identifies influence in the Supreme Court, programs that identify problem doctors who are still practicing, and new methods to discover patterns in police use-of-force cases. In this talk, Cheryl Phillips walks through some of the ways journalism with impact is built on sophisticated data science, and lays out the hardest technical challenges accountability and investigative journalists face now, including how to use generative AI in a way that produces reliable results, doesn’t break the bank, and results in news stories with impact.

====Session I====
  * Automated Reverse Engineering of Data Visualizations from In-the-Wild Examples, //Parker Ziegler//
Abstract: Examples are foundational in helping data journalists author interactive graphics, whether by demonstrating challenging techniques or serving as building blocks for new design exploration. However, a key element of an example’s usefulness is the availability of its source code. If a data journalist wants to work from an “in-the-wild” example for which no source code is available, they have to resort to manual reverse engineering to produce an approximation of the original visualization. This is a time-consuming and error-prone process, erasing much of the original benefit of working from an example. In this talk, I’ll present our work on reviz, a compiler and accompanying Chrome extension that automatically generates parameterized data visualization programs from input SVG subtrees. I’ll walk through the reviz architecture from an end user’s perspective before diving deep into the internals of our reverse engineering and compilation processes.
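
As a rough illustration of one step such a pipeline might perform, the sketch below partitions the attributes of sibling SVG marks into constants (shared styling) and candidates for data-driven parameters. Every name in it is hypothetical; this is a sketch of the general idea, not reviz’s implementation.

<code python>
# Illustrative sketch: infer which SVG attributes are constant (style)
# vs. varying (data-driven) across sibling marks. Hypothetical names;
# this is not the reviz implementation.
import xml.etree.ElementTree as ET
from collections import defaultdict

SVG = """
<svg xmlns="http://www.w3.org/2000/svg">
  <circle cx="10" cy="40" r="3" fill="steelblue"/>
  <circle cx="20" cy="25" r="3" fill="steelblue"/>
  <circle cx="30" cy="60" r="3" fill="steelblue"/>
</svg>
"""

def partition_attributes(siblings):
    """Split attributes into constants (same on every mark) and parameters."""
    values = defaultdict(set)
    for el in siblings:
        for name, value in el.attrib.items():
            values[name].add(value)
    constants = {k: v.pop() for k, v in values.items() if len(v) == 1}
    parameters = sorted(k for k, v in values.items() if len(v) > 1)
    return constants, parameters

root = ET.fromstring(SVG)
circles = root.findall("{http://www.w3.org/2000/svg}circle")
constants, parameters = partition_attributes(circles)
print(constants)   # {'r': '3', 'fill': 'steelblue'} -> fixed style
print(parameters)  # ['cx', 'cy'] -> candidates for data-driven encodings
</code>

Varying attributes like these are exactly the ones a generated program would expose as parameters bound to data, while the constant ones can be emitted as literal styles.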

  * ACORN: Performant and Predicate-Agnostic Search Over Vector Embeddings and Structured Data, //Liana Patel//
Abstract: Increasingly, …
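
For orientation, the sketch below shows the naive “pre-filter, then exact top-k” baseline for hybrid queries that combine a structured predicate with vector similarity: apply the predicate first, then rank the surviving rows by cosine similarity. This is a generic reference point only, not ACORN’s algorithm, and all names in it are invented.

<code python>
# Naive pre-filter baseline for hybrid search: apply the structured
# predicate, then do exact top-k cosine ranking over the survivors.
# Illustrative only; ACORN's actual index and traversal differ.
import numpy as np

rng = np.random.default_rng(0)
n, d = 10_000, 64
vectors = rng.standard_normal((n, d)).astype(np.float32)
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)
years = rng.integers(2000, 2024, size=n)   # a structured attribute

def hybrid_search(query, predicate_mask, k=5):
    """Exact top-k cosine search restricted to rows where the predicate holds."""
    idx = np.flatnonzero(predicate_mask)
    sims = vectors[idx] @ (query / np.linalg.norm(query))
    return idx[np.argsort(-sims)[:k]]

query = rng.standard_normal(d).astype(np.float32)
hits = hybrid_search(query, years >= 2020)
print(hits, years[hits])
</code>

Judging from the title, “predicate-agnostic” points at serving arbitrary predicates from a single index, rather than maintaining a filtering pass like this (or post-filtering) per predicate.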

====Session II====
  * Describing Differences in Image Groups, //Lisa Dunlap//
Abstract: Reasoning about vast amounts of visual data is predominantly a human-centered task, making it the bottleneck of many data science and machine learning pipelines. In this work we explore the problem of automatically describing differences between sets of images with natural language. Our proposed method ImDiff utilizes descriptive captioning and large language models to propose concepts which are more present in one set of images than the other, verifying that these hypotheses are grounded in the images using CLIP. We develop a suite of quantitative benchmarks to assess the correctness and relevance of described differences, …
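
The proposer-verifier loop this describes can be sketched as below. Here caption(), propose_differences(), and clip_score() are hypothetical stand-ins for the captioner, the language model, and CLIP; the stub bodies exist only so the sketch runs end to end.

<code python>
# Hedged sketch of a proposer-verifier loop for describing set differences.
# The three helpers are stand-ins, not ImDiff's real components.
from statistics import mean

def caption(img):
    """Stand-in for an image captioning model."""
    return img["caption"]

def propose_differences(caps_a, caps_b):
    """Stand-in for an LLM that reads both caption lists and proposes
    concepts more present in set A than in set B."""
    return ["a dog outdoors", "a cat indoors"]

def clip_score(img, text):
    """Stand-in for CLIP image-text similarity."""
    return 1.0 if text in img["caption"] else 0.0

def describe_differences(set_a, set_b, top_n=3):
    caps_a = [caption(img) for img in set_a]
    caps_b = [caption(img) for img in set_b]
    candidates = propose_differences(caps_a, caps_b)
    scored = []
    for text in candidates:
        # Keep hypotheses that score higher on set A than on set B.
        gap = (mean(clip_score(img, text) for img in set_a)
               - mean(clip_score(img, text) for img in set_b))
        scored.append((gap, text))
    return [t for gap, t in sorted(scored, reverse=True)[:top_n] if gap > 0]

set_a = [{"caption": "a dog outdoors"}, {"caption": "a dog outdoors at a park"}]
set_b = [{"caption": "a cat indoors"}, {"caption": "a cat indoors sleeping"}]
print(describe_differences(set_a, set_b))  # ['a dog outdoors']
</code>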

  * Sketching with Code: Experiments in Programming with Natural Language, //J.D. Zamfirescu//
Abstract: LLMs' impressive capabilities in synthesizing and explaining code offer opportunities—and challenges—for human interactions with programs. Programming languages offer, critically, an unambiguous, …

  * Syntactic Code Search with Sequence-to-Tree Matching, //Gabriel Matute//
Abstract: Syntactic analysis tools like Semgrep and Comby leverage the structure in code, making them more expressive than traditional string and regex search. Meanwhile they also use a lightweight specification, …
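
To make structure-aware search concrete, the sketch below matches a small pattern tree (with metavariables) against Python ASTs. It is a toy in the spirit of Semgrep-style matching, written for illustration; it is not the sequence-to-tree algorithm this talk presents.

<code python>
# Toy tree-pattern matcher for code search. Metavariables are names
# starting with "_" and match any subtree. Illustrative only.
import ast

def matches(pattern, node, bindings):
    if isinstance(pattern, ast.Name) and pattern.id.startswith("_"):
        bindings[pattern.id] = node            # metavariable: bind any subtree
        return True
    if type(pattern) is not type(node):
        return False
    for field, pv in ast.iter_fields(pattern):
        nv = getattr(node, field, None)
        if isinstance(pv, list):
            if not isinstance(nv, list) or len(pv) != len(nv):
                return False
            for p, n in zip(pv, nv):
                ok = matches(p, n, bindings) if isinstance(p, ast.AST) else p == n
                if not ok:
                    return False
        elif isinstance(pv, ast.AST):
            if not isinstance(nv, ast.AST) or not matches(pv, nv, bindings):
                return False
        elif pv != nv:
            return False
    return True

def search(pattern_src, code_src):
    """Yield (node, bindings) for every subtree matching the pattern."""
    pattern = ast.parse(pattern_src, mode="eval").body
    for node in ast.walk(ast.parse(code_src)):
        bindings = {}
        if matches(pattern, node, bindings):
            yield node, bindings

# Find every call to open(), whatever its argument is:
code = "f = open(path)\ng = open('log.txt')"
for node, env in search("open(_ARG)", code):
    print(ast.unparse(env["_ARG"]))   # path, then 'log.txt'
</code>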

====Session III====
  * Designing Reliable Human-AI Interactions with Retrieval Augmented Models, //Niloufar Salehi//
Abstract: Machine learning models fail in unpredictable ways, and many produce outputs that are difficult for users to verify, such as machine translation and code generation. Providing guidance on when to rely on a system is challenging because these models can generate a wide range of outputs (e.g. text), error boundaries are highly stochastic, and automated explanations may be incorrect. I will discuss this problem in the healthcare context, where models trained on past data can be incredibly useful but also challenging to use reliably. For instance, healthcare providers increasingly use machine translation (MT) for patients who do not speak the dominant language. However, MT systems can produce inaccurate translations. My work develops approaches to improve the reliability of ML models by designing actionable strategies for a user to gauge reliability and recover from potential errors.
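
One simple, widely known heuristic for gauging translation reliability is a round-trip check: translate the output back into the source language and compare it with the original. It is offered here purely as an illustration of an actionable reliability signal, not as the strategy this work proposes; translate() below is a stand-in for a real MT system.

<code python>
# Illustrative round-trip reliability check for machine translation.
# translate() is a hypothetical stand-in; plug in a real MT model or API.
from difflib import SequenceMatcher

def translate(text, src, tgt):
    """Stand-in MT system backed by a tiny canned dictionary."""
    demo = {
        ("en", "es"): {"Take one pill daily": "Tome una pastilla al día"},
        ("es", "en"): {"Tome una pastilla al día": "Take one pill each day"},
    }
    return demo[(src, tgt)].get(text, text)

def round_trip_score(text, src="en", tgt="es"):
    """Similarity in [0, 1] between the original and its back-translation."""
    forward = translate(text, src, tgt)
    back = translate(forward, tgt, src)
    return SequenceMatcher(None, text.lower(), back.lower()).ratio()

score = round_trip_score("Take one pill daily")
print(f"round-trip similarity: {score:.2f}")  # low scores flag risky outputs
</code>

A low score does not prove the translation is wrong, and a high one does not prove it is right; calibrating such signals for users is part of what makes reliable reliance hard.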

  * Bolt-on, Compact, and Rapid Program Slicing for Notebooks, //Shreya Shankar//
Abstract: Computational notebooks are commonly used for iterative workflows, such as in exploratory data analysis. This process lends itself to the accumulation of old code and hidden state, making it hard for users to reason about the lineage of, e.g., plots depicting insights or trained machine learning models. One way to reason about the code used to generate various notebook data artifacts is to compute a program slice, but traditional static approaches to slicing can be both inaccurate (failing to contain relevant code for artifacts) and conservative (containing unnecessary code for an artifact). We present nbslicer, a dynamic slicer optimized for the notebook setting whose instrumentation for resolving dynamic data dependencies is both bolt-on (and therefore portable) and switchable (allowing it to be selectively disabled in order to reduce instrumentation overhead). We demonstrate nbslicer’s ability to construct small and accurate backward slices (i.e., historical cell dependencies) and forward slices (i.e., cells affected by the …
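
A dynamic slicer of this kind can be pictured as a reachability query over recorded def/use information. The sketch below assumes instrumentation has already logged which symbols each cell defines and uses; the data structures and names are hypothetical, not nbslicer’s internals.

<code python>
# Sketch of backward slicing over a notebook's dynamic dependency graph.
# Assumes per-cell def/use sets were recorded at runtime; illustrative only.
from collections import defaultdict

# (cell_id, symbols defined, symbols used), in execution order
cells = [
    (1, {"df"}, set()),        # df = load_data()
    (2, {"clean"}, {"df"}),    # clean = df.dropna()
    (3, {"model"}, {"clean"}), # model = fit(clean)
    (4, {"plot"}, {"df"}),     # plot = df.hist()
]

def backward_slice(target):
    """Smallest set of cells needed to reproduce `target`'s artifacts."""
    last_def, deps = {}, defaultdict(set)
    for cid, defines, uses in cells:
        for sym in uses:
            if sym in last_def:
                deps[cid].add(last_def[sym])   # dynamic data dependency
        for sym in defines:
            last_def[sym] = cid
    slice_, stack = set(), [target]
    while stack:
        cid = stack.pop()
        if cid not in slice_:
            slice_.add(cid)
            stack.extend(deps[cid])
    return sorted(slice_)

print(backward_slice(3))  # [1, 2, 3]; cell 4 is correctly excluded
</code>

A forward slice is the same reachability query run over the reversed edges, answering which cells would be affected if a given cell changed.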

====Session IV====
  * How Domain Experts Use an Embedded DSL, //Lisa Rennels//
Abstract: …

  * The Role and Ramifications of Text: Chart Annotations Influence Bias Perception and Data Understanding, …
Abstract: Prior work shows that readers prefer charts that include annotations and explores how these text elements affect topic recall and conclusions. However, there's …