Computer Science Department Dissertations
To obtain a copy of a dissertation listed here, contact the author directly.
| Name | Date | |
|---|---|---|
| Matthew Hammer | ||
| Title: Self-Adjusting Machines | ||
In computer systems, the interaction of computations and data often causes incremental changes, not wholesale ones. Hence, there exists the possibility of improving the efficiency of a system by recording and reusing computations, rather than blindly recomputing them from scratch. To this end, self-adjusting computation is a technique for systematically constructing computational structures that evolve efficiently and incrementally. Past work has explored several facets of this programming-language-based approach, but no prior formulations have given a low-level account of self-adjusting computation. By low-level, we mean an account where machine resources are defined and managed explicitly, e.g., as in the C programming language. We offer self-adjusting machines, a concept based on an operational interpretation of self-adjusting computation with explicit machine resources. By making their resources explicit, self-adjusting machines give an operational account of self-adjusting computation suitable for interoperation with low-level languages; via practical compilation and run-time techniques, these machines are programmable, sound, and efficient. Abstractly, we formally define self-adjusting machines that run an intermediate language; we prove that this abstract machine semantics is sound. Concretely, we give techniques based on this semantics that construct self-adjusting machines by compiling a C-like surface language into C target code that runs within an extensible, C-based run-time system. By creating new programming abstractions, library programmers can extend this C-based system with new self-adjusting behavior. We demonstrate the power of this extension framework by using it to express a variety of both new and old self-adjusting abstractions. We give an empirical evaluation showing that our approach is efficient and well suited for programming space- and time-efficient self-adjusting computations.
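The change-propagation idea underlying self-adjusting computation can be illustrated with a toy sketch: values live in modifiable references that remember which computations read them, and writing a changed value re-runs only the dependent computations. This is a minimal illustration of the general technique, not the dissertation's C-based machines; every name below is invented.

```python
class Mod:
    """A modifiable reference that remembers which computations read it."""
    def __init__(self, value):
        self.value = value
        self.readers = []

    def read(self, computation):
        if computation not in self.readers:
            self.readers.append(computation)
        return self.value

    def write(self, value):
        if value != self.value:
            self.value = value
            # Change propagation: re-run only the dependent computations.
            for comp in list(self.readers):
                comp.rerun()

class Computation:
    """A re-runnable unit of work; counts runs to show reuse."""
    runs = 0

    def __init__(self, fn):
        self.fn = fn
        self.rerun()

    def rerun(self):
        Computation.runs += 1
        self.result = self.fn(self)

a, b = Mod(1), Mod(2)
total = Computation(lambda c: a.read(c) + b.read(c))
assert total.result == 3
a.write(10)            # triggers exactly one re-run of `total`
assert total.result == 12 and Computation.runs == 2
```

A real system must also handle nested computations, memoization, and explicit memory management, which is precisely where the low-level account above comes in.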
| Sravana Reddy | ||
| Title: Learning Pronunciations from Unlabeled Evidence | ||
The pronunciation of a word represented in an alphabetic writing system (such as this one) is relatively transparent -- but a language's sounds change over time and vary across space, while its spellings tend to remain static, resulting in some amount of divergence between the written and spoken forms. The introduction of loanwords and proper names from other languages with different phonologies and scripts further complicates the relationship between orthography and pronunciation. However, there are sources of information about the sound of a word besides spelling. Speech is the most natural example: a word's pronunciation is greatly clarified upon hearing it in a spoken utterance. In the case of proper names, knowing the linguistic or ethnic origin of the name is often instrumental in determining how it should be pronounced. Rhymes in poems or songs also provide a cue to pronunciation, which is particularly relevant for inferring the sound of a word at an earlier point in history. Predicting the pronunciations of words given their written representations is a problem encountered regularly by human speakers, who will often use secondary contextual information when the spelling by itself proves to be ambiguous. Building a system to generalize from a pronunciation dictionary and hypothesize the pronunciations of new words, often referred to as grapheme-to-phoneme modeling, is also an important engineering problem in speech technology, as well as in the machine translation of named entities. Using extra-orthographic evidence in a computational model for grapheme-to-phoneme conversion necessitates having the data that provides this information in sufficiently large quantities. Collecting annotated data -- speech labeled with the words that it contains, names with their ethnic origins, or poetry with rhyming patterns -- can be extremely difficult and expensive. 
On the other hand, unlabeled data -- speech recordings, lists of names, archives of poetry -- is plentiful. This thesis presents methods for learning pronunciations of words using these forms of unlabeled evidence.
| Morgan Sonderegger | ||
| Title: Phonetic and phonological dynamics on reality television | ||
The sounds of a language spoken by an individual, or shared by a speech community, can be seen as both remarkably stable and subject to great change. For example, one's accent intuitively seems very stable over adulthood, especially in comparison to one's constantly changing vocabulary; however, many people report that their accents shifted after moving from one city to another, or due to social pressure. This thesis addresses two questions about stability and change in sound systems, or phonetic and phonological dynamics: what are the dynamics of sound systems in individuals during adulthood, and what causes underlie them? Previous work has addressed these questions on two timescales. Short-term studies examine shifts in (phonetic and phonological) variables under exposure to the speech of others, e.g. over the course of a conversation. Long-term studies examine shifts in variables between time points separated by years. Short-term shifts are fairly robust, with most speakers showing some shift for most variables. Long-term shifts are extremely irregular, with huge variation in the amount of shift among speakers and variables. If short-term shifts accumulated in individuals, long-term shifts would be rampant -- what is the relationship between the patterns seen in short-term and long-term dynamics? And more generally, what do the dynamics of sound systems look like at any time scale in between? The bulk of the thesis is a medium-term case study addressing these questions, using a setting where day-by-day phonetic and phonological dynamics can be observed within individuals: the reality television show Big Brother UK, where speakers live in an isolated house for three months, and are continuously recorded. We consider 5 variables in 6 hours of speech from one season of the show: voice onset time (VOT), coronal stop deletion, and formant frequencies for three vowels.
We build mixed-effects regression models of day-to-day time dependence for each variable, for each of 12 speakers, controlling for linguistic factors. Variability is the norm: speakers and variables show four qualitatively different types of time dependence, with a significant minority showing stability. There is some evidence that particular speakers (across variables) and particular variables (across speakers) show characteristic types of time dependence. Long-term time trends do sometimes occur, which could be due to accumulation of short-term shifts. Day-by-day variation is common, but far from universal. These results suggest a tentative account of the relationship between short-term and long-term dynamics, and directions for future work. The thesis also addresses two topics closely related to phonetic and phonological dynamics: synchronic variation, and automatic phonetic measurement. For each variable, we build a model of synchronic variation as a preliminary step to modeling the variable's dynamics within speakers; these models turn out to yield interesting and surprising findings with respect to previous work. Many questions of interest about phonetic and phonological dynamics require radically scaling up from the hand-labeled datasets used in most previous work, making automatic measurement methods crucial. Our main methodological contribution is a discriminative, large-margin algorithm for automatic VOT measurement, treated as a case of predicting structured output from speech. The algorithm is tested on data from four corpora representing different types of speech. It achieves performance near human inter-transcriber reliability, and compares favorably with previous work.
| Parinya Chalermsook | June 2012 | |
| Title: Approximation Algorithms for Integral Concurrent Flow, Independent Set of Rectangles, and Fire Containment Problems | ||
| First Position: Postdoc at IDSIA (Switzerland) and the Max Planck Institute (Germany). In 2013-14, he will spend another year at Max Planck. | ||
We study the approximability of three NP-hard combinatorial optimization problems that arise in various areas of computer science: Integral Concurrent Flow, Maximum Independent Set of Rectangles, and Fire Containment problems. In the first part, we study the integral counterpart of the classical Maximum Concurrent Flow problem, as well as its variants. In the basic version of this problem (basic-ICF), we are given an undirected graph $G$ with edge capacities $c(e)$, a subset $T$ of vertices called terminals, and a demand $D(t,t')$ for every pair $(t,t')$ of the terminals. The goal is to find the maximum value $\lambda$, and a collection $S$ of paths, such that every pair $(t,t')$ of terminals is connected by $\lambda D(t,t')$ paths in $S$, and the number of paths containing any edge $e$ is at most $c(e)$. We show a poly-logarithmic approximation algorithm that violates the edge capacities by only a constant factor. The key ingredient of our results is a new graph splitting technique which could be of independent interest. The results in this part are based on joint work with Julia Chuzhoy, Alina Ene, and Shi Li in STOC 2012. In the second part of the thesis, we turn to a geometric problem. Given a collection of rectangles, our goal is to select a non-overlapping sub-collection of maximum cardinality. This problem has many applications, as well as connections to other problems in various areas of computer science. The problem was introduced in 1997 together with a simple $O(\log n)$ approximation algorithm for it. Despite many attempts from different groups of researchers, the $O(\log n)$ upper bound remained the best known. We show an $O(\log \log n)$ approximation algorithm. Our algorithm illustrates a connection between this problem and a half-century-old rectangle coloring problem. The main result in this part is from joint work with Julia Chuzhoy that appeared in SODA 2009.
Finally, we focus on a resource minimization version of the fire containment problem in a standard mathematical model that is used to deal with spreading processes of fire and viruses. We show LP-rounding approximation algorithms for various classes of graphs and provide essentially tight lower bounds on the integrality gap of natural LP relaxations for these graph classes. Most results in this part were presented at SODA 2010 in joint work with Julia Chuzhoy.
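The basic-ICF objective described above can be written as a path-based linear program; the integral problem additionally requires the path variables to be integers. The following is the standard concurrent-flow formulation, given here for orientation, and not necessarily the exact relaxation analyzed in the thesis:

```latex
\begin{align*}
\max\quad & \lambda \\
\text{s.t.}\quad
  & \sum_{P \in \mathcal{P}_{t,t'}} f(P) \;\ge\; \lambda\, D(t,t')
      && \text{for every terminal pair } (t,t'), \\
  & \sum_{P \,:\, e \in P} f(P) \;\le\; c(e)
      && \text{for every edge } e, \\
  & f(P) \;\ge\; 0
      && \text{for every path } P,
\end{align*}
```

where $\mathcal{P}_{t,t'}$ denotes the set of paths connecting $t$ and $t'$; basic-ICF further demands $f(P) \in \mathbb{Z}_{\ge 0}$, which is what makes the integral problem substantially harder than its fractional counterpart.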
| Ross Girshick | June 2012 | |
| Title: From Rigid Templates to Grammars: Object Detection with Structured Models | ||
| First Position: 2-year postdoc position with Prof. Jitendra Malik at UC Berkeley | ||
We develop models for localizing instances of a generic object category, such as cars or people, in images. We define these models using a grammar formalism. In this formalism compositional rules are used to encode models that can range in complexity from simple rigid templates to rich deformable part models with variable structure. A central contribution of this dissertation is an exploration along this axis, wherein we gradually enrich our object category representations. We demonstrate that these richer models lead to improved object detection performance on challenging datasets such as the PASCAL VOC Challenges. While building richer models, we would like to make use of existing training data and annotations. These annotations typically specify labels, such as object bounding boxes, that are "weak" compared to the derivation trees produced by detection with a grammar model. We propose a discriminative training method that directly supports learning models from weakly-labeled data. We show how to apply this formalism to the problem of learning the parameters of a grammar model. This approach results in a top-performing method for detecting people in images. In order to achieve widespread use in research and applications, an object detection system must not only be accurate, but also fast. Toward efficient computation, we develop a technique for "compiling" one of our object models into a much faster detector that implements a cascade architecture. We show how to select the cascade thresholds in a way that is both safe and effective. We demonstrate that the cascaded detector produces detections 15x faster than the non-cascade approach with no loss in precision or recall.
| Joshua Grochow | June 2012 | |
| Title: Symmetry and equivalence relations in classical and geometric complexity theory | ||
| First Position: Postdoc at the University of Toronto | ||
This thesis studies some of the ways in which symmetries and equivalence relations arise in classical and geometric complexity theory. The Geometric Complexity Theory program is aimed at resolving central questions in complexity such as P versus NP using techniques from algebraic geometry and representation theory. The equivalence relations we study are mostly algebraic in nature and we heavily use algebraic techniques to reason about the computational properties of these problems. We first provide a primer on Geometric Complexity Theory, to provide perspective and motivate the other problems we study. We give improved algorithms to test matrix isomorphism of matrix Lie algebras, which is a problem that arises naturally in Geometric Complexity Theory. For certain cases of matrix isomorphism of Lie algebras we provide polynomial-time algorithms, and for other cases we show that the problem is as hard as graph isomorphism. To our knowledge, this is the first time graph isomorphism has appeared in connection with any lower bounds program. Finally, we study algorithms for equivalence relations more generally (joint work with Lance Fortnow). Two techniques are often employed for algorithmically deciding equivalence relations: (1) finding a complete set of easily computable invariants, or (2) finding an algorithm which will compute a canonical form for each equivalence class. Some equivalence relations in the literature have been solved efficiently by other means as well. We ask whether these three conditions -- having an efficient solution, having an efficiently computable complete invariant, and having an efficiently computable canonical form -- are equivalent. We show that this question requires non-relativizing techniques to resolve, and provide new connections between this question and factoring integers, probabilistic algorithms, and quantum computation.
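The three notions compared above (an efficient decision procedure, a complete invariant, a canonical form) can be made concrete on a toy equivalence relation. The sketch below uses anagram equivalence of words, which is far simpler than the relations studied in the thesis; all names in it are invented for illustration.

```python
def canonical_form(word):
    """Canonical form: maps every member of an equivalence class to one
    distinguished representative of that class."""
    return "".join(sorted(word))

def complete_invariant(word):
    """Complete invariant: equal on two words iff they are equivalent
    (here, the multiset of letters)."""
    counts = {}
    for ch in word:
        counts[ch] = counts.get(ch, 0) + 1
    return frozenset(counts.items())

def equivalent(u, v):
    """Decision procedure for the relation itself."""
    return canonical_form(u) == canonical_form(v)

# A canonical form immediately yields a complete invariant, and a
# complete invariant immediately yields a decision procedure; whether
# the reverse implications hold efficiently is the question studied.
assert equivalent("listen", "silent")
assert complete_invariant("listen") == complete_invariant("silent")
assert canonical_form("listen") == "eilnst"
assert not equivalent("listen", "lists")
```

For anagrams all three notions happen to be efficiently computable; the thesis asks whether that coincidence holds for every efficiently decidable equivalence relation.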
| Xueyuan Zhou | ||
| Title: Learning Functions on Unknown Manifolds | ||
As more and more complex data sources become available, the analysis of graph and manifold data has become an essential part of various sciences. In this thesis, learning a function through samples on a manifold is investigated. Toward this goal, several problems are studied. First, regularization in Sobolev spaces on manifolds is discussed, which can be used to find a smooth function on the data manifold. The regularizer is implemented by the iterated Laplacian, which is a natural tool for studying Sobolev spaces on manifolds. This form of regularization generalizes thin-plate splines from a regular grid on a known domain to random samples on an unknown manifold. Second, we study the asymptotic behavior of the graph Laplacian eigenmaps method, which is a finite-dimensional approximation of an infinite-dimensional problem. The analysis shows that the method, as a nonparametric regression method, achieves the optimal integrated mean squared error rate in terms of the intrinsic dimensionality of the manifold. Third, the limit behavior of the graph Laplacian on a manifold boundary and its implications are discussed. Finally, we study a manifold-ranking algorithm from information retrieval from a functional-analysis point of view, which goes beyond the physics-based intuition.
| Adam Shaw | ||
| Title: Implementation Techniques for Nested Data-Parallel Languages | ||
Nested data-parallel languages allow computation in parallel over irregular nested data structures. The classic approach to compiling nested data parallelism in high-level languages is to apply flattening to nested structures. Flattening separates nested data and its shape into distinct values: a flat data vector, and a representation of the nesting information. In a parallel context, flattening is beneficial because computation on flat data vectors maps easily onto parallel hardware, and it is easier to partition work across processing elements in flattened code. Traditionally, flattening is a wholesale transformation that unravels all nested data structures and correspondingly transforms the operations on them. Such total flattening may not always yield the best performance: sometimes we might want to flatten partway, or not at all. To accommodate such possibilities, we present hybrid flattening. In hybrid flattening transformations, only certain structures are flattened, and to varying degrees. This dissertation presents a formal framework for defining hybrid flattening transformations. We use our framework to define a novel flattening transformation on a model programming language. Guided by our model, we implemented our transformation in the compiler for Parallel ML (PML), a nested data-parallel language with implicitly-threaded features. Our implementation demonstrates the utility of the transformation. Across various benchmarks, transformed programs perform better than untransformed ones, scale better, and compete favorably with efficient sequential programs in C and SML. With our system, running PML programs on a 48-core machine yields as much as a thirtyfold improvement over their sequential counterparts.
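The flat-vector-plus-shape representation described above can be sketched in a few lines: a nested sequence is split into a flat data vector and a segment descriptor recording the nesting. This is a generic illustration of flattening, not code from the Parallel ML compiler, and the helper names are invented.

```python
def flatten(nested):
    """Split a nested sequence into flat data plus its shape."""
    data = [x for seg in nested for x in seg]
    segdes = [len(seg) for seg in nested]   # segment descriptor
    return data, segdes

def segmented_sum(data, segdes):
    """A per-segment reduction computed over the flat representation;
    each segment's sum is read off consecutive slices of `data`."""
    sums, i = [], 0
    for length in segdes:
        sums.append(sum(data[i:i + length]))
        i += length
    return sums

nested = [[1, 2, 3], [], [4, 5]]
data, segdes = flatten(nested)
assert data == [1, 2, 3, 4, 5]
assert segdes == [3, 0, 2]
assert segmented_sum(data, segdes) == [6, 0, 9]
```

The flat vector `data` maps directly onto parallel hardware, while `segdes` preserves enough shape to reconstruct or reduce the original nesting; hybrid flattening chooses per structure whether to apply this split.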
| Sonjia Waxmonsky | ||
| Title: Natural Language Processing for Named Entities with Word-internal Information | ||
In this thesis we present three projects that focus on natural language processing of named entities in text data, specifically for scenarios and domains where rare and out-of-vocabulary (OOV) words are problematic. First, we present work on the discovery of the language of origin of a named entity. This has applications in speech and language processing tasks, such as text-to-speech synthesis, since language of origin can help predict pronunciation of OOV words. Word origin recognition has also been studied for demographics and life sciences as a component in the collection of ethnicity data. Previous research has applied supervised machine learning methods to automate this task, but this requires a set of hand-labeled training data for each language represented in the model. Hand-labeled data may be expensive to acquire, and the set of origin languages may not be known a priori. We consider how active learning can be applied to minimize the amount of manual annotation needed to train a successful supervised model. We also apply word origin modeling to grapheme-to-phoneme (G2P) conversion of US surnames, using both supervised and unsupervised approaches. Finally, we present work in biomedical text mining, where we examine named entity tagging of disease mentions in biomedical text. We extract morphology information from disease terms to be included as features in a Conditional Random Field model. We also show how biomedical disease terms can be decompounded into their component stem parts. Morphology information is acquired with the Linguistica toolkit for unsupervised learning of morphology, which has the advantage that a hand-segmented training set is not required for feature extraction.
| Peter Brune | ||
| Title: Fast Numerical Methods and Biological Problems | ||
This thesis encompasses a number of efforts toward the development of fast numerical methods and their applications, particularly to the simulation of biochemical systems. Scientific computing as a discipline requires considerations from computer science, namely algorithmic efficiency and software automation. Also required is knowledge of applied mathematics to bridge the gap between computer science and a specific application. This thesis spans these fields: the study and implementation of optimal numerical techniques encompasses two chapters, and the development and application of numerical techniques to biochemical problems encompasses another two. The first of these efforts is the construction of robust, optimal geometric unstructured multigrid methods in the face of difficult problem and mesh conditions. The second is the construction of optimal discrete function spaces for problems arising in quantum mechanics and electronic-structure calculations. The third and fourth are the development of fast and flexible methods for nanoscale implicit-solvent electrostatics. The development of fast and parallel methods for an important quantity of interest in classical density functional theory calculations is discussed, and the derivation and implementation of a finite element method for improved solvent models using automated scientific computing tools is described.
| Paolo Codenotti | ||
| Title: Testing Isomorphism of Combinatorial and Algebraic Structures | ||
Our main results are algorithms to test isomorphism of hypergraphs and of groups. We give an algorithm to test isomorphism of hypergraphs of rank $k$ in time $\exp(\tilde{O}(k^2\sqrt{n}))$, where $n$ is the number of vertices (joint work with L. Babai, FOCS 2008). (The rank is the maximum size of hyperedges; the tilde refers to a polylogarithmic factor.) The case of bounded $k$ answered a then 24-year-old question and removed an obstacle to improving the worst-case bound for Graph Isomorphism testing. The best previously known bound, even for $k=3$, was $C^n$ (Luks 1999). We consider the problem of testing isomorphism of groups of order $n$ given by Cayley tables. The easy $n^{\log n}$ bound on the time complexity for the general case has not been improved over the past four decades. We demonstrate that the presence of abelian normal subgroups is the only obstacle to efficient algorithms: our main result is that "semisimple groups," defined as groups without abelian normal subgroups, admit polynomial-time isomorphism testing (joint work with L. Babai, J. A. Grochow, Y. Qiao, SODA 2011, and L. Babai, Y. Qiao, in preparation). The key ingredients are (a) a study of the group theoretic structure of semisimple groups; (b) an algorithm to test permutational isomorphism of permutation groups in time polynomial in the order and exponential in the degree; (c) an algorithm for the "twisted code equivalence problem," which generalizes the code equivalence problem by adding a group action on the alphabet. The problems in (b) and (c) are of interest in their own right.
| Nick Trebon | ||
| Title: Enabling Urgent Computing within the Existing Distributed Computing Infrastructure | ||
This dissertation defines the domain of urgent computing as the set of deadline-constrained scientific simulations and models that are used to inform some critical decision (e.g., severe weather forecasting, the modeling of infectious diseases, wildfire simulations, patient-specific medical treatment). These computations typically target large, distributed systems (e.g., supercomputers, clusters, Grids, clouds) that are shared by many users. This resource contention often leads to nontrivial and highly variable allocation delays that impede the ability of the computations to meet their deadlines. Thus, resources that support urgent computing should provide elevated priority access to these urgent computations. This dissertation focuses on the additional infrastructure necessary to aid these critical computations in meeting their deadlines within the current distributed computing environment -- in particular, the combination of management mechanisms, access policies, and tools needed to support urgent computing on batch-queue resources (e.g., supercomputers and clusters) and computational clouds. This dissertation describes the implementation of the Special Priority and Urgent Computing Environment (SPRUCE), a token-based framework that manages urgent computing users, sessions, and resources. This framework has been widely deployed on the TeraGrid. Additionally, this dissertation proposes and evaluates a set of elevated priority policies that different resource types (i.e., batch-queue resources, cloud resources) can use to support urgent computing. Included in this evaluation is a discussion of the effect these policies have on nonurgent users of the resource, along with potential compensation schemes for affected users.
This dissertation also identifies and evaluates a set of statistical methods and heuristics that can be used to predict probabilistic upper bounds on the total turnaround time of an urgent computation (i.e., file staging, resource allocation, and execution delay). These upper bounds can guide the user in selecting the resource and policy that offer the greatest probability of meeting the given deadline.
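As a rough illustration of predicting a probabilistic upper bound on one delay component, a generic statistical method is an empirical quantile of historical observations. This sketch is not SPRUCE's actual predictor, and the delay numbers below are made up.

```python
from math import ceil

def empirical_upper_bound(samples, confidence=0.9):
    """Return the smallest observed value such that at least a
    `confidence` fraction of the historical samples fall at or
    below it (an empirical quantile)."""
    ordered = sorted(samples)
    k = ceil(confidence * len(ordered)) - 1  # index of the quantile
    return ordered[k]

# Hypothetical batch-queue allocation delays, in minutes.
delays = [3, 5, 8, 2, 40, 7, 6, 90, 4, 5]
bound = empirical_upper_bound(delays, confidence=0.9)
print(bound)  # → 40: 90% of the past delays were at most 40 minutes
```

Summing such per-component bounds (staging, allocation, execution) gives a conservative end-to-end estimate that a user can compare against the deadline when choosing a resource and policy.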
| Ozgur Sumer | March 2012 | |
| Title: Adaptive Inference for Graphical Models | ||
Many algorithms and applications involve repeatedly solving variations of the same inference problem; for example, to introduce new evidence to the model or to change conditional dependencies as used in dual-decomposition methods. As the model is updated, the goal of adaptive inference is to take advantage of previously computed quantities to perform inference more rapidly than from scratch. This task is especially critical for dual-decomposition methods as they decompose a complex model into a collection of simpler subproblems (such as trees) and iteratively solve them by changing the model parameters at each iteration. In this thesis, we first present algorithms for adaptive exact inference on general graphs that can be used to efficiently compute marginals and update MAP configurations under arbitrary changes to the input factor graph and its associated elimination tree. We then apply this methodology to dual-decomposition approximate inference. The key to our approach is a linear-time preprocessing step which builds a data structure called the cluster tree that can be efficiently maintained when the underlying model is slightly modified. We demonstrate how the cluster tree can be used to compute any marginal in time that is logarithmic in the size of the input model. Moreover, in contrast to max-product, our approach can also be used to update MAP configurations in time that is roughly proportional to the number of updated entries, rather than the size of the input model. This enables us to use our framework to speed up the convergence of dual-decomposition methods, particularly when combined with the high degree of parallelism to which the cluster tree is easily amenable. To evaluate the practical effectiveness of our algorithms, we implement and test them using synthetic data as well as for three real-world applications: protein secondary structure prediction, minimum-energy conformation of a protein, and stereo matching.
Our experiments show that adaptive exact inference can achieve substantial speedups over performing complete inference as the model undergoes small changes over time. In the case of dual-decomposition methods, this translates into a significant improvement in convergence time.
| Duru Turkoglu | March 2012 | |
| Title: Stable Algorithms and Kinetic Mesh Refinement | ||
There has been increasing interest in developing dynamic and kinetic algorithms, as real-world applications require computing certain properties of an input as it is modified by insertions and deletions (dynamic) or by continuous motion (kinetic). In many instances, the static version of a problem, in which the input is not allowed to change, is well understood, while the dynamic and/or kinetic versions are much harder to develop. To expand our intuition on algorithms that must handle several types of modifications, including dynamic and kinetic ones, Acar et al. recently proposed an alternative framework called self-adjusting computation. Given an algorithm that solves the static version of a dynamic or kinetic problem, the principal algorithmic technique of the self-adjusting computation framework, called change propagation, utilizes this static algorithm to generate an update algorithm for the same problem. The effectiveness of the update algorithm generated by this approach depends directly on the stability of the static algorithm: an algorithm is stable if its executions on two similar inputs produce outputs and intermediate data that differ only by a small fraction. In this thesis, we utilize the stability approach and design update algorithms inspired by change propagation for solving certain dynamic and kinetic problems in computational geometry. One of the most interesting of these is the problem of kinetically maintaining a mesh of continuously moving points; we provide a first result on this long-standing open problem. The results in this thesis indicate that the stability approach to designing dynamic and kinetic algorithms alleviates the complexity of the update algorithms, transferring the inherent complexity to the stable design and analysis of a static solution.
| Gulriz Kurban | ||
| Title: Computational Analysis of Protein Modular Domain Architectures | ||
This dissertation presents novel computational methods to identify domains in multi-domain protein sequences. These methods predict a protein's modular domain architecture (MDA) from its local alignments with a large set of proteins that contain one or more homologous domains. The first method builds a probabilistic model that partitions the protein's positions into domains based on observed alignments, giving a preliminary prediction of the protein's MDA. The second method incorporates the likelihood of switching between domains to improve prediction accuracy. It extends the initial model to take into account the linker propensities of the individual amino acids at each position and the adjacency of positions. The end product of this research is a tool that facilitates further experimental and computational analyses that require preliminary knowledge of protein domain positions, such as protein structure determination and functional annotation.
| Gohar Margaryan | ||
| Title: Parallel Multi-Objective Optimization Algorithm for De Novo Design of Target-Focused Ligand Libraries | ||
Rational drug design is the process of finding new medications based on knowledge of the biological target. Candidate molecules can fail to become drugs for various reasons, such as poor absorption or toxicity. A successful drug must therefore satisfy multiple, sometimes competing objectives. In this dissertation we develop a computational tool for the automated design of novel small organic molecules that will potentially exhibit good activity against a desired drug target. A parallel genetic multi-objective algorithm mutates and recombines molecular fragments, while a novel neural-network-based scoring function for estimating binding affinity to a drug target guides the search in molecular space toward molecules optimized to bind well to the target. A second objective function calculates the similarity of the prospective ligand to known drugs, guiding the search toward molecules exhibiting other drug-like properties. The method is tested on several drug targets, and a comparative study between our method and another common technique in computer-aided drug design (virtual screening of an existing chemical database) is conducted. Our findings demonstrate that, in general, the set of molecules obtained by our method is better optimized for the imposed objectives.
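A standard building block for the kind of multi-objective search described above is Pareto dominance: a candidate survives only if no other candidate beats it on every objective at once. The sketch below is a generic illustration, not the dissertation's algorithm, and the two objective scores per candidate are invented.

```python
def dominates(a, b):
    """a dominates b if it is at least as good on every objective
    (here, higher is better) and strictly better on at least one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(population):
    """Keep the candidates not dominated by any other candidate."""
    return [p for p in population
            if not any(dominates(q, p) for q in population if q != p)]

# Hypothetical (binding-affinity score, drug-likeness score) pairs.
candidates = [(0.9, 0.2), (0.5, 0.5), (0.4, 0.6), (0.3, 0.3)]
front = pareto_front(candidates)
assert (0.3, 0.3) not in front          # dominated by (0.5, 0.5)
assert len(front) == 3                  # the rest are incomparable
```

A genetic multi-objective algorithm repeatedly mutates and recombines candidates and keeps (a diversity-pruned subset of) the Pareto front, rather than collapsing the objectives into a single score.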
| Siwei Wang | August 2010 | |
| Title: Improving Tone Recognition with Nucleus Modeling and Sequential Learning | ||
Mandarin Chinese and many other tonal languages use tones, defined as specific pitch patterns, to distinguish otherwise ambiguous syllables. It has been shown that tones carry at least as much information as vowels in Mandarin Chinese [Surendran et al., 2005]. Surprisingly, though, many speech recognition systems for Mandarin Chinese have not performed explicit tone recognition. Systems that initially tried to incorporate tone information by recognizing tonal syllables often found that errors at the tone recognition stage outweighed any benefit from the correctly recognized tones. Instead, these systems recognized toneless syllables and used large-scale language models for disambiguation. We attribute these problems of earlier tone recognition approaches to several limitations of their tone modeling strategies, which our approach aims to correct. The first limitation is that most prior tone recognition approaches did not model the effect of neighboring syllables in continuous speech, known as coarticulation, which often severely distorts the realizations of the underlying tones. We propose two nucleus modeling techniques to locate the most reliably produced portion of the syllable and remove the regions affected by coarticulation. The first method manipulates a geometric representation of both pitch and amplitude to locate the tone nucleus as the region with the greatest articulatory effort. The second method models tone nuclei based on the likelihood output of a landmark-based vowel detector, linking the sonority profile of the speech signal to the stability of tone production. Tone recognition using both nucleus modeling techniques outperforms tone recognition using either baseline nucleus approaches or the full syllable. Furthermore, the second, landmark-based nucleus modeling technique shows a 15% improvement over two published tone recognition frameworks. 
The second limitation is that most of the earlier approaches provided limited or no modeling of the tone variation due to phrase-, sentence- and topic-level intonation. Most previous modeling techniques aimed at modifying the pitch levels and ranges of tones to adjust for the impact of these intonation events. However, we demonstrate that none of these approaches adequately compensates for the boundary effects of these intonational units. Our approach employs sequential learning frameworks to model the effect of the broader intonation context on tone realization and recognition. Our analysis also shows that all three factors (tone identity, level of intonation event, and structure of the sequential graphical model) can influence recognition performance. This is because, first, different tones show different pitch patterns across phrase-, sentence- and story-level boundaries, and second, different structures of sequential graphical models encode these variations differently. Bibliography: Dinoj Surendran, Gina-Anne Levow, and Yi Xu. Tone recognition in Mandarin using focus. Proceedings of Interspeech/ICSLP 2005, 2005. | ||
| Michael Rainey | August 2010 | |
| Title: Effective scheduling techniques for high-level parallel languages | ||
In the not-so-distant past, parallel programming was mostly the concern of programmers specializing in high-performance computing. Nowadays, many desktop and laptop computers come equipped with a species of shared-memory multiprocessor called a multicore processor, making parallel programming a concern for a much broader range of programmers. Declarative languages, such as Parallel ML (PML) and Haskell, seek to reduce the complexity of programming multicore processors by providing programmers with abstract execution models, such as implicit threading, where programmers annotate their programs to suggest the parallel decomposition. Implicitly-threaded programs, however, do not specify the actual decomposition of computations or the mapping from computations to cores. The annotations act simply as hints that the language implementation can ignore. The parallel decomposition itself is the responsibility of the runtime system and, more specifically, of the scheduling system. Threads can take arbitrarily different amounts of time to execute, and these times are difficult to predict in advance. Implicit threading urges the programmer to divide the program into threads that are as small as possible, because doing so increases the flexibility of the scheduler in its duty to distribute work evenly across processors. The downside of such fine-grain parallelism is that the total scheduling cost can be significant. It is quite possible that, for a given thread, the scheduling costs outweigh the benefits of parallelism. This problem is the focus of this dissertation. An effective runtime system is both scalable and robust. A runtime system is scalable if performance, in terms of execution time, improves in proportion to the number of processing elements. A runtime system is robust when performance is consistently good under changing conditions, e.g., a change of input data set, number of processors, or program structure. 
Robust runtime systems are predictable across programs generally, not just those tuned for a particular set of conditions. In this dissertation, I present two new techniques, Lazy Promotion and Lazy Tree Splitting, for increasing the effectiveness of runtime systems. Lazy Promotion is a strategy that improves the performance of a work-stealing scheduler by reducing the amount of load the scheduler places on the garbage collector. Lazy Tree Splitting is a technique for automatically scheduling the execution of parallel operations over trees to yield scalable performance and eliminate the need for per-application tuning. I use Manticore, PML's compiler and runtime system, and a 16-core NUMA machine as a testbed for these techniques. In addition, I present two empirical studies. In the first study, I evaluate Lazy Promotion over three representative PML benchmarks. The results demonstrate that Lazy Promotion either outperforms or performs the same as an alternative scheme based on Eager Promotion. This study also evaluates the design of the Manticore runtime system, in particular the split-heap memory manager, by comparing the system to an alternative based on a unified-heap memory manager, and showing that the unified version has limited scalability due to poor locality. In the second study, I evaluate Lazy Tree Splitting over seven PML benchmarks by comparing it to its alternative, Eager Tree Splitting. The results show that, although the two techniques offer similar scalability, only Lazy Tree Splitting is suitable as a component of a robust system. | ||
| Borja Sotomayor | August 2010 | |
| Title: Provisioning Computational Resources Using Virtual Machines and Leases | ||
| First Position: Scientific Writer, Computation Institute, University of Chicago & Argonne National Laboratory | ||
The need for computational resources has, over the years, become a fundamental requirement in both science and industry. In many cases, this need is transient: a user may only require computational resources for the duration of a well-defined task. For example, a scientist could require a large number of computers to run a simulation for just a few hours, but might not need those computers at any specific time (as long as they are made available in a reasonable amount of time). A college instructor may want to make a cluster of computers available to students during the course's lab sessions, at very specific times during the week, and with a specific software configuration. A telecommunications company could possess an existing infrastructure that hosts a number of websites, but may need to supplement that infrastructure with additional resources during periods of unforeseen increased web traffic, meaning those resources have to be made available right away, with very little advance notice. These transient resource usage scenarios pose the problem of how to provision shared computational resources efficiently. This problem has been studied for decades, resulting in approaches that tend to be highly specialized to specific usage scenarios. For example, the problem of how to run multiple jobs on a shared cluster has been extensively studied, resulting in job management systems like Torque/Maui, Sun Grid Engine, LoadLeveler, and many others, which can queue and prioritize job requests efficiently (in these systems, efficiency is defined in terms of a variety of metrics, including waiting times and resource utilization). Such a system would meet the requirements of the scientist wanting to run simulations for a few hours; the college instructor and the telecommunications company mentioned above, on the other hand, would be ill-served by a job management system and the efficiency metrics typically used in job management. 
Conversely, other resource provisioning approaches are not particularly well suited for job-oriented computations (this point will be explored in greater detail throughout this chapter). Thus, there is no general solution that can provision resources meeting the requirements of different usage scenarios simultaneously, such as those mentioned above, reconciling the different measures of efficiency in each scenario. More specifically, much of my work is motivated by the combination of best-effort resource requirements, where a user needs computational resources but is willing to wait for them (possibly setting a deadline), and advance reservation resource requirements, where the resources must be available at a specific time. In the former, efficiency is typically measured in terms of waiting times (or similar metrics such as turnaround times or slowdowns) or throughput, while the latter is usually concerned with providing the requested resources at exactly the agreed-upon times without interruption; both are concerned with maximizing the use of hardware resources and possibly monetary profit. Although best-effort and advance reservation provisioning have each been studied separately, the combination of the two is known to produce utilization problems and is discouraged in practice. In this dissertation I seek to develop a resource provisioning model and architecture that can support multiple resource provisioning scenarios efficiently and simultaneously, with an initial focus on the best-effort and advance-reservation cases mentioned above, arguing in favor of a lease-based model where leases are implemented as virtual machines (VMs). The main contributions of this dissertation are: 1. A resource provisioning model and architecture that uses leases as a fundamental abstraction and virtual machines as an implementation vehicle. 2. Lease scheduling algorithms that mitigate the utilization problems typically encountered when scheduling advance reservations. 3. 
A model for the various overheads involved in using virtual machines, and algorithms that (a) allow lease terms to be met even in the presence of this overhead, and (b) mitigate this overhead in some cases. 4. Price-based policies for lease admission, showing that an adaptive pricing strategy can, in some cases, generate more revenue than other baseline pricing strategies while using fewer resources, thus giving resource providers more excess capacity that can potentially be sold to other users. As a technological contribution, I also present Haizea (http://haizea.cs.uchicago.edu/), an open-source reference implementation of the architecture and algorithms described in this dissertation. | ||
| Yingqi Xiao | August 2010 | |
| Title: Abstract Trace Analysis for Concurrent Programs | ||
Program analysis plays an important role in modern software. With the advent of cheap parallel processing on the desktop (and laptop), concurrency is increasingly widely used in software. Many concurrent languages provide powerful concurrency primitives, such as dynamic creation of processes, dynamic creation of first-class channels, and communication over those channels. These primitives give programmers greater flexibility in concurrent programming, but at the same time they make programs more complex and more non-deterministic. The consequence is that designing and implementing static analyses for such programs becomes both more important and more challenging. One fundamental challenge of static analysis for concurrent programs is that deciding whether a program point is reachable in the presence of nondeterminism is undecidable. To address this problem, we must provide a safe approximation of the set of possible run-time behaviors, so that we can compute useful properties of programs without losing too much precision. The goal of this thesis is to investigate techniques for static analysis of dynamic communication in concurrent programs and to present a new approach to analyzing them. In this thesis, we present a new approach to computing a precise approximation of all possible control paths that reach each program point. We call this approach abstract trace analysis. The approximation is precise because it considers only valid control paths, in which functions return to the point where they were called. We also demonstrate how to build powerful analyses on top of abstract trace analysis. We show how to analyze communication topology based on abstract trace analysis. The analysis can extract channel usage information from abstract traces, distinguish different abstract dynamic instances of the same channels, and generate more precise output than our previous modular work. 
In this thesis, we also show how to extract concurrency information from abstract traces and use it for optimization, detecting and transforming monitor threads into more efficient monitor functions. | ||
| George Kuan | June 2010 | |
| Title: A True Higher-Order Module System | ||
The ML module system is a powerful and expressive language for modular programming and enforcing data abstraction. Several dialects of ML have extended the module system with support for higher-order modules, which improves support for modular programming and elevates the module system to a full functional language. With the exception of Standard ML of New Jersey and MLton, higher-order modules in these dialects do not obey natural beta-reduction semantics for higher-order functor application (true higher-order semantics). The design space and semantics of a true higher-order module system have not been thoroughly explored. Most existing formal accounts of module system semantics neglect true higher-order semantics by separating higher-order functors from type generativity, which limits the flexibility of higher-order functors. The accounts that do consider higher-order module semantics neglect the interactions between higher-order modules and key core language features such as generative datatype declarations. True higher-order semantics also, paradoxically, complicates true separate compilation. In this dissertation, I contribute (1) a novel formal account of a module system with true higher-order semantics, (2) a static entity calculus that cleanly isolates and expresses the higher-order semantics, and (3) an exploration of the design space of higher-order module semantics, including true separate compilation and the signature calculus. | ||
| Raghav Kulkarni | August 2010 | |
| Title: Computational Complexity: Counting, Evasiveness, and Isolation | ||
| First Position: Postdoctoral Fellow, Laboratory of Research in Informatics (LRI) - University of Paris 7, Paris. | ||
We present results in three different topics of Theoretical Computer Science, namely Complexity of Counting, Decision Tree Complexity, and Derandomization of the Isolation Lemma (in restricted cases) via Planarity. We present one representative result from each topic: (Counting) Permanent mod $2^k$ is as easy as Determinant mod $2^k.$ More specifically, for any fixed $k,$ Permanent mod $2^k$ is complete for $\oplus L$ (parity L). (Evasiveness) Any monotone property of sparse graphs is evasive, i.e., one must query all the edges in the worst case. (Isolation) Efficient deterministic isolation for bipartite planar graphs is possible, whereas the same for non-bipartite planar graphs would have non-trivial consequences. | ||
| Andy Terrel | August 2010 | |
| Title: Finite element method automation for non-Newtonian fluid models | ||
| First Position: Postdoctoral fellow, Texas Advanced Computing Center, University of Texas at Austin | ||
Finite element methods (FEMs) are a popular simulation technique for discretizing partial differential equations (PDEs) into an algebraic problem. For well-studied problems, theory shows which methods are optimal and can be applied to a wide variety of applications. Building a FEM model for a new or complex problem, however, can be a labor-intensive, error-prone task. These difficulties have been somewhat alleviated by automation, but the tools often lag in performance behind hand-optimized, specialized code. Because of the ease of specifying new models and solver techniques, automated solvers are an ideal setting in which to examine challenging problems and evaluate a large number of methods. By developing automation techniques, we study a challenging class of problems: complex fluids. | ||
| Fangfang Xia | March 2010 | |
| Title: Inference of Evolutionary Trees and Horizontal Gene Transfer | ||
| First Position: Computational postdoctoral fellow at Argonne National Laboratory | ||
As sequence data has accumulated, emphasis in evolutionary molecular biology has been placed on identifying homologous gene families and, from gene families, reconstructing evolutionary histories for species. However, evolutionary processes such as gene duplication, gene loss and horizontal gene transfer can result in incongruence between gene and species trees. In this thesis, we present algorithms for tree reconciliation and species tree reconstruction within both a parsimony framework and a likelihood framework. Our focus is on the likelihood model, which integrates sequence evolution, macroevolutionary processes and a relaxed molecular clock for substitution rates. We validate our algorithms through simulation studies, and we demonstrate their utility in case studies with real sequence data to estimate rates of evolutionary events, infer subsystem and species phylogenies, and detect horizontal gene transfer. | ||
| Hariharan Narayanan | August 2009 | |
| Title: Diffusion in Computer Science and Statistics | ||
| First Position: Postdoctoral Fellow, Laboratory for Information and Decision Systems, Massachusetts Institute of Technology | ||
In this thesis, we investigate diffusion as an algorithmic and analytic tool in statistics and computer science. We address a question arising from computational linguistics, where we wish to understand the behavior of a network of agents, modeled as nodes of a graph, that adaptively modify their lexicon using data from their neighbors. By introducing a model of memory and a family of coalescing random walks, we prove that they eventually reach a consensus with probability 1. We devise a distributed algorithm using a diffusion process having two time scales. Addressing the question of routing in a network, we use steady-state diffusions corresponding to electrical flow in a network of resistors for oblivious routing and prove that this scheme performs well under a variety of performance measures. Based on a microscopic view of diffusion as an ensemble of particles executing independent Brownian motions, we develop the fastest currently known algorithm for computing the area of the boundary of a convex set. A similar technique is used to produce samplers for the boundaries of convex sets and for smooth hypersurfaces that are the boundaries of open sets in Euclidean space, assuming access to samplers for the interior. These algorithms are motivated by goodness-of-fit tests in statistics. The half-plane capacity, a quantity used to parameterize conformally invariant stochastic processes known as Schramm-Loewner evolutions, is shown to be comparable to a more geometric notion defined in terms of the areas of neighborhoods. Schramm-Loewner evolutions are important in statistical physics. We analyze a class of natural random walks on a Riemannian manifold, and give bounds on the mixing times in terms of the Cheeger constant and a notion of smoothness that relates the random walk to the metric. A Markov chain having a stationary distribution that is uniform on the interior of a polytope is developed. 
This is the first chain whose mixing time is strongly polynomial when initiated in the vicinity of the center of mass. This Markov chain can be interpreted as a random walk on a certain manifold. The resulting algorithm for sampling polytopes outperforms known algorithms when the number of constraints is of the same order of magnitude as the dimension. We use a variant of this Markov chain to design a randomized version of Dikin's affine scaling algorithm for linear programming. We provide polynomial-time guarantees which do not exist for Dikin's algorithm. Addressing a question from machine learning, under certain smoothness conditions, we prove that a form of weighted surface area is the limit of the weight of graph cuts in a family of random graphs arising in the context of clustering. This is done by relating both to the amount of diffusion across the surface in question. Addressing a related issue on manifolds, we obtain an upper bound on the annealed entropy of the collection of open subsets of a manifold whose boundaries are well conditioned with respect to a bounded density. This result leads to an upper bound on the number of random samples needed before it is possible to accurately classify data lying on a manifold. | ||
| Ioan Raicu | March 2009 | |
| Title: Many-Task Computing: Bridging the Gap between High Throughput Computing and High Performance Computing | ||
| First Position: NSF/CRA Computing Innovation Fellow, Department of Electrical Engineering and Computer Science, Northwestern University | ||
Many-task computing aims to bridge the gap between two computing paradigms, high throughput computing and high performance computing. Many-task computing is reminiscent of high throughput computing, but it differs in its emphasis on using many computing resources over short periods of time to accomplish many computational tasks, where the primary metrics are measured in seconds (e.g., floating point operations per second, tasks per second, input or output rates per second), as opposed to operations per day or month (e.g., jobs per month). Many-task computing denotes high-performance computations comprising multiple distinct activities, coupled via file system operations. Tasks may be small or large, uniprocessor or multiprocessor, compute-intensive or data-intensive. The set of tasks may be static or dynamic, homogeneous or heterogeneous, loosely coupled or tightly coupled. The aggregate number of tasks, quantity of computing, and volumes of data may be extremely large. Many-task computing includes loosely coupled applications that are generally communication-intensive but not naturally expressed using the message passing interface commonly found in high performance computing, drawing attention to the many computations that are heterogeneous but not “happily” parallel. This dissertation explores fundamental issues in defining the many-task computing paradigm, as well as theoretical and practical issues in supporting both compute-intensive and data-intensive many-task computing on large-scale systems. We have defined an abstract model for data diffusion, designed scheduling policies with heuristics to optimize real-world performance, and developed a competitive online caching eviction policy. We also designed and implemented the necessary middleware, Falkon, to enable the support of many-task computing on clusters, grids and supercomputers. 
Falkon, a Fast and Light-weight tasK executiON framework, addresses shortcomings in traditional resource management systems for high throughput and high performance computing, which are not suitable for, or efficient at, supporting many-task computing applications. Falkon was designed to enable the rapid and efficient execution of many tasks on large-scale systems (i.e., through multi-level scheduling and streamlined distributed task dispatching), and to integrate novel data management capabilities (i.e., data diffusion, which uses data caching and data-aware scheduling to exploit data locality) to extend the scalability of data-intensive applications well beyond that of traditional shared or parallel file systems. As the size of scientific data sets and the resources required for their analysis increase, data locality becomes crucial to the efficient use of large-scale distributed systems for data-intensive many-task computing. We propose a “data diffusion” approach that acquires compute and storage resources dynamically, replicates data in response to demand, and schedules computations close to data. As demand increases, more resources are acquired, allowing faster response to subsequent requests that refer to the same data; as demand drops, resources are released. This approach provides the benefits of dedicated hardware without the associated high costs, depending on workload and resource characteristics. Micro-benchmarks have shown Falkon to achieve throughputs of over 15,000 tasks/sec, scale to millions of queued tasks, and execute billions of tasks per day. Data diffusion has also been shown to improve application scalability and performance, with the ability to achieve hundreds of Gb/s I/O rates on modest-sized clusters. Falkon has shown orders-of-magnitude improvements in performance and scalability across many diverse workloads (e.g., heterogeneous tasks from 100 ms to hours long, compute-intensive, data-intensive, varying arrival rates) and applications (e.g., 
astronomy, medicine, chemistry, molecular dynamics, economic modeling, and data analytics) at scales of billions of tasks on hundreds of thousands of processors across clusters, Grids (e.g. TeraGrid), and supercomputers (e.g. IBM Blue Gene/P and Sun Constellation). | ||
| Michael Papka | ||
| Title: Visualization and Collaboration Technologies to Support High-Performance Computing Research | ||
High performance computing has become an increasingly important fixture in science, from aiding in the processing of data collected in experiments, to acting as a virtual laboratory in which experiments are done. Thus, high performance computing is creating a third branch of scientific effort. This trend has driven research and development in a variety of areas, from fundamental hardware design to the software that makes the resources useful. With each iteration of this development cycle, computational science has become more and more complex. This effort addresses this complexity in two key interrelated areas: visualization and collaboration. Visual representation is the key method for simplifying the explanation of a complex environment. Consequently, a large research and development community effort has grown to support scientific visualization. It has also spawned research in advanced displays to provide infrastructure for exploring data products. These include immersive displays like the CAVE Automatic Virtual Environment, and high-resolution displays constructed of multiple individual units, like the ActiveMural. This work includes influential contributions in all of these areas. At the same time, complex tasks are often simplified by effort sharing. We see that the teams of individuals working together to do this new form of science have become larger and more distributed. Research efforts in collaboration technology have grown to address this problem. Here we describe the Access Grid and tools built for sharing information as part of this effort. As will be seen in this thesis, collaboration technology both relies on visualization technology and supports it in enabling interactions at a distance. Throughout this work we have taken a user-driven, iterative approach using real applications from a variety of scientific domains. This end-to-end testbed approach guarantees realistic experimental circumstances, with real-world stresses and constraints. 
The main contributions of this dissertation are: a) discovery of requirements for the connecting of collaboration and visualization technology to high performance computing; b) development of infrastructure and demonstrations for enabling coupled advanced displays and high performance resources, including the first remote connection of two spatially immersive virtual environments (CAVE to CAVE); and c) development of infrastructure for efficient pixel transport using commodity video codecs to support collaborative scientific visualization. | ||
| Leandro Cortés | ||
| Title: Detection and Tracking of Multiple Objects in Fluorescence Microscopy | ||
Researchers in biology regularly produce large data sets of noisy images and videos that contain hundreds of fluorescent objects interacting in a cluttered background. This dissertation presents a statistical framework for analyzing images and videos of this kind. Specifically, we analyze images that contain hundreds of overlapping polonies, and videos of hundreds of vesicles and tubular organelles produced by total internal reflection fluorescence microscopy. We approach the problem of detecting and tracking multiple fluorescent objects by defining statistical data models for individual objects and for the background, with clear rules for composing them in an image. A statistical model for the data allows us to formulate well-defined hypotheses and properly weigh them on-line. The computational challenge of object detection is addressed by defining a sequence of coarse-to-fine tests, derived from the statistical model, to quickly eliminate most candidate locations for the objects. The computational load of the tests is initially very low and gradually increases as the false positives become more difficult to eliminate. Only at the last step are state variables estimated from a complete time-dependent model. Processing time thus depends mainly on the number and size of the objects in the image, not on the image size. The main contributions of this dissertation are: (a) a general statistical model for image and video data presenting multiple fluorescent objects such as polonies, vesicles and tubular objects, (b) the use of these models to derive stable coarse-to-fine algorithms for efficient detection and tracking of overlapping objects, (c) the derivation of simple tests to identify types of vesicle dynamics, and (d) two end-user applications: one for polony detection and another for vesicle tracking and dynamics identification. | ||
| Xinghua Shi | ||
| Title: System and Tools to Support a Bayesian Approach to Improving Large-Scale Metabolic Models | ||

