Todd Leen's Home page


Todd K. Leen, Professor
Department of Biomedical Engineering
Phone: 503 748-1160
Fax: 503 748-1306
Curriculum vitae
Return to BME Faculty Profile

Research Interests

My research in machine learning includes theory and algorithm synthesis, with applications to signal processing and to fault detection and prediction in domains ranging from health care to complex environmental systems.

Starting August 12, 2012, Todd Leen will be at the National Science Foundation serving as Program Director for Robust Intelligence in the Information and Intelligent Systems Division within the Computer Science and Engineering Directorate.

Robustly Detecting Clinical Laboratory Errors

Funded by NSF
Todd Leen, Deniz Erdogmus (Co-PI), Steven Kazmierczak (acting PI)

Hospital clinical laboratory tests are a major source of medical information used to diagnose, treat, and monitor patients. Errors in these tests lead to delays, additional expense, repeated clinical evaluation, and sometimes to erroneous treatments that increase risk to patients. Such errors compromise clinical utility, cost effectiveness, and patient safety. One recent study suggests that errors in measured total blood calcium concentration due to instrument mis-calibration alone cost from $60M to $199M annually in the US; as noted below, the bulk of errors do not originate in instrument mis-calibration.

Clinical laboratory errors affect about 0.5% of samples collected. Of those, approximately 75% originate during sample collection, transport, and storage (jointly called the pre-analytic phase), before samples reach the analysis instruments. However, the quality-control measures standard in hospital clinical test labs monitor only instrument calibration against fiducial test materials. They are therefore completely blind to sample faults introduced in the pre-analytic phase, where most errors originate.

Data derived from patient samples, rather than instrument calibration checks, holds the key to detecting faults introduced in the pre-analytic phase. Attempts to date to use such information have been primitive and grossly insufficient: current methods are either so insensitive that they fail to detect sample faults reliably, or they routinely flag normal samples as faulty.
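To make the idea concrete, here is a minimal sketch (not the project's actual models) of how a joint, multivariate statistical model of patient data can flag a sample whose individual analyte values look acceptable but whose combination is implausible. The analyte pairing, reference values, and threshold below are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "historical" panel of two correlated analytes; the means and
# covariance are invented for illustration, not real reference ranges.
mean = np.array([9.5, 4.0])
cov = np.array([[0.16, 0.08],
                [0.08, 0.09]])
history = rng.multivariate_normal(mean, cov, size=5000)

# Fit the multivariate model: sample mean and inverse covariance.
mu = history.mean(axis=0)
prec = np.linalg.inv(np.cov(history, rowvar=False))

def mahalanobis_sq(x):
    """Squared Mahalanobis distance of a test panel from the fitted model."""
    d = x - mu
    return float(d @ prec @ d)

# Each value here is well within its own univariate range, but the
# *combination* runs against the analytes' correlation, so the joint
# model flags it while per-analyte limit checks would not.
odd_sample = np.array([10.0, 3.5])
threshold = 9.21  # chi-square, 2 dof, 99th percentile
flag = mahalanobis_sq(odd_sample) > threshold
```

The same distance computation extends directly to panels of many analytes; only the dimensions of `mean` and `cov` change.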

This project develops and uses statistical machine learning technology to reliably detect errors in hospital clinical laboratory tests, using data derived from patient samples. In a preliminary study, the PI showed that multivariate statistical models of lab tests revealed errors that existing techniques missed. The primary obstacle to developing reliable statistical detectors for lab errors is the cost of labeling samples combined with the low error rate. Developing and evaluating any automated error-detection algorithm requires a sufficient number of samples, both faulty and non-faulty. Determining which tests are faulty requires review of the tests and other patient data (e.g. charts) by a clinical lab expert, a time-consuming and economically infeasible prospect given the low fault rate. The project addresses this challenge through active learning, which selects subsets of the data for labeling by human experts, with emphasis on the rare fault class. The project focuses on chronic kidney disease because of its medical importance and the large data repository at the PI's institution. This research will provide algorithms for clinical lab error detection that extend to tests used in other diseases (for example diabetes and heart failure).
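The active-learning idea can be sketched as follows, under loud assumptions: synthetic two-dimensional data stand in for lab-test features, a hand-rolled logistic regression stands in for the project's statistical models, and plain pool-based uncertainty sampling stands in for the project's selection criteria.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for a sample pool with a rare (~2%) fault class.
n = 2000
X = rng.normal(size=(n, 2))
y = (X[:, 0] + X[:, 1] > 3.0).astype(float)

def fit_logistic(X, y, steps=500, lr=0.5):
    """Plain gradient-descent logistic regression; enough for a sketch."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-np.clip(Xb @ w, -500, 500)))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict_proba(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return 1.0 / (1.0 + np.exp(-np.clip(Xb @ w, -500, 500)))

# Seed set: a small random sample plus two known faults from the expert.
seed = set(rng.choice(n, size=20, replace=False)) | set(np.where(y == 1)[0][:2])
labeled = sorted(seed)
pool = [i for i in range(n) if i not in seed]
n_seed = len(labeled)

# Pool-based active learning: each round, ask the expert to label only the
# points the current model is least certain about.  Near the decision
# boundary the fault class tends to be heavily over-represented relative
# to its base rate, so expert effort concentrates on the rare class.
for _ in range(10):
    w = fit_logistic(X[labeled], y[labeled])
    p = predict_proba(w, X[pool])
    queries = np.argsort(np.abs(p - 0.5))[:5]   # 5 most uncertain points
    for j in sorted(queries, reverse=True):
        labeled.append(pool.pop(j))
```

Real rare-class active learning uses more refined query criteria than raw uncertainty, but the loop structure (fit, score the unlabeled pool, query, repeat) is the same.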

Ultimately, the error-detection algorithms developed in this research will make their way into clinical laboratory information systems and on into commercial products, enabling deployment at a scale significant enough to have widespread positive impact on laboratory costs and patient risk.

Stochastic Learning Dynamics

Click here for Software package

Funded by NSF
Todd Leen

The discovery that synaptic plasticity is mediated by processes sensitive to the precise relative timing of pre- and post-synaptic events overturned models of synaptic change based on average activity levels (so-called rate-dependent models). Spike-timing-dependent plasticity (STDP) requires new theoretical tools for its description.

Individual STDP events have inherent random variability, as well as variability from timing fluctuations due to circuit-level random factors, so computational synaptic dynamics in the new paradigm must be grounded in the theory of stochastic processes. Previous work modeling the stochastic dynamics of STDP typically used the nonlinear Fokker-Planck equation (FPE) to approximate the intractable master equation governing the dynamics. Although often useful, the FPE is known to be deeply flawed and potentially misleading. The situation recalls the use of the FPE by machine learning theorists in the early to mid 1990s; the dynamics of both STDP and on-line machine learning algorithms follow a Markov process described by a master equation. This project establishes rigorous tools for treating the stochastic dynamics of learning systems based on spike-timing-dependent synaptic plasticity. It develops well-grounded approximation techniques (and exact solutions where available) for probability distributions on the synaptic weights and their moments, and applies the new techniques to synaptic dynamics in natural and artificial learning systems. The new methods are compared to the FPE used in recent literature to provide insight into the accuracy and appropriateness of the various methods. The techniques are relevant not only to computational neuroscience and machine learning, but more broadly to any regime whose Markov dynamics are described by a master equation, potentially including state estimation and the chemical master equation. The project provides software to the research community for computing distributions and moments using the new methods.

Computational Neuroscience

Dr. Pat Roberts (OHSU), Prof. Nathan Sawtell (Columbia University), and I have an NIH/NSF Collaborative Research in Computational Neuroscience project on sensory-motor processing and memory in the mormyrid weakly electric fish. The fish have an electro-location system that uses the animal's electric organ discharge (EOD) to navigate, identify objects, and find prey. The electrosensory lateral line lobe (ELL) of the mormyrid brain integrates motor command, proprioceptive, and electrosensory information in a cerebellar-like structure. As part of its function, the ELL generates memories comprising the expected sensory signal from the fish's own electric discharge. These memories are adapted over time through spike-timing-dependent plasticity (STDP). The project integrates modeling and neurophysiology experiments to determine how realistic patterns of excitation are processed in ELL, and how plasticity is controlled by recurrent connections from higher centers. As part of the project, we are developing a novel computer-controlled stimulus system that provides precise control of the spatio-temporal profile of the electric images on the fish's skin.

Health Care Applications of Machine Learning

I collaborate with the OHSU Point of Care Laboratory (POCL) and with Jeff Kaye, director of OHSU's Layton Aging & Alzheimer's Disease Center. My work with these colleagues is aimed at detecting behavioral changes that are predictive of emerging health problems, particularly cognitive decline. This work makes use of a number of novel unobtrusive in-home monitoring technologies to provide early detection of health-related changes.

Environmental Observation and Forecasting Systems

I've enjoyed a collaboration with Antonio Baptista and the OHSU Institute of Environmental Health's NSF-STC Center for Coastal Margin Observation and Prediction. Our work on the CORIE project has improved the reliability of measurements and modeling in the Columbia River estuary. We developed and deployed a system to detect biofouling of salinity sensors in the estuary that cut data loss in half. We have applied learning technology as a key element of a problem-portable data-assimilation (Bayesian model/data fusion) system. Ours is the first data assimilation system to operate successfully in a strongly non-linear river-estuary-ocean system. Our novel model surrogates, trained to emulate the dynamics of extremely large (10^7 degrees of freedom) finite-element hydrodynamic models, are a critical enabling technology for this work. The surrogates accelerate forward model evaluation by factors of up to twelve thousand, enabling a dramatic increase in ensemble prediction capability.
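The surrogate idea can be sketched in miniature. Here an invented one-parameter "expensive" forward model (brute-force relaxation of a nonlinear ODE) stands in for the real 10^7-degree-of-freedom simulations, and a simple polynomial fit stands in for the project's surrogate architecture; none of this is the actual CORIE code.

```python
import numpy as np

def forward_model(theta):
    """Stand-in for an expensive simulation: relax dx/dt = theta - sin(x) - x
    to steady state by brute-force time stepping."""
    x = 0.0
    for _ in range(20000):
        x += 0.001 * (theta - np.sin(x) - x)
    return x

# Offline: run the expensive model on a design of forcing values ...
thetas = np.linspace(0.0, 2.0, 15)
outputs = np.array([forward_model(t) for t in thetas])

# ... and fit a cheap polynomial surrogate to emulate the input-output map.
surrogate = np.poly1d(np.polyfit(thetas, outputs, deg=3))

# Online: the surrogate replaces the forward model inside the assimilation
# or ensemble loop at a tiny fraction of the cost per evaluation.
theta_new = 1.3
approx = surrogate(theta_new)
exact = forward_model(theta_new)
```

The design choice is the usual surrogate trade: pay once, offline, for a batch of expensive model runs, then amortize that cost over the many forward evaluations an ensemble or data-assimilation cycle demands.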

Selected Publications