My Research
Robi Polikar
Overview
My research areas cover the interdisciplinary fields of computational models of learning,
computational and artificial intelligence,
pattern recognition and signal processing with particular applications in
biomedical engineering and non-destructive evaluation.
One of my goals is to establish the
Signal Processing and Pattern Recognition Laboratory (SPPRL)
as a national model of such a laboratory at a primarily undergraduate institution.
This model builds on Rowan Engineering’s hallmark Engineering Clinics,
where undergraduate and graduate students work together.
If you would like to sponsor any work at this laboratory,
and/or have a problem in mind that we might be able to help solve, please see the
Support SPPRL
pages.
Initial funding for SPPRL was provided by the
National Science Foundation
-
Electrical and Communications Systems Division,
Power, Controls and Adaptive Systems program
through the CAREER program, under grant number
ECS 0239090.
This work led to the development of now highly cited incremental learning algorithms,
such as Learn++ and Learn++.NC (for incrementally learning new concept classes).
A subsequent grant,
ECCS 0926159,
funded the recent work on learning in nonstationary environments,
and led to the development of the Learn++.NSE algorithm.
The current grant,
ECCS 1310496,
is being used to address one of the most challenging problems in machine learning: learning from streaming data,
where limited labeled training data is available only at an initial state,
followed by unlabeled data, whose distribution may be changing and drifting.
We call this an “initially labeled streaming nonstationary environment,”
and we are developing the COMPOSE (Compacted Object Sample Extraction) algorithm to address it.
Here are some specific projects I am currently working on or have recently worked on.
A
List of Publications
and a
list of recent and active grants
are also available, where you may download relevant papers.
Machine Learning, Computational Intelligence & Learning from Data:
Incremental Learning, Nonstationary Learning,
Data Fusion and The Missing Feature Problem(1)
One of the more recent and remarkable areas of machine learning and computational intelligence research has been
ensemble systems, which have been shown to have desirable properties over single classifier systems.
Ensemble systems minimize our chance of choosing a poor classifier,
and in general provide better generalization performance than single classifier based systems.
They are typically more robust, resistant to over-fitting problems,
and can be used to address several different types of problems,
such as incremental learning, concept drift,
confidence estimation, data fusion and the missing feature problem.
We have developed the Learn++ family of algorithms to use ensemble systems to address these problems
and have found that they are remarkably robust
and perform very well under a broad spectrum of environments and applications.
The family includes the original Learn++ for incremental learning of new data,
Learn++.NC for incremental learning of new concept classes,
Learn++.MF for accommodating missing features and missing data,
and Learn++.DF for data fusion.
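The intuition that an ensemble can outperform its individual members can be illustrated with a standard textbook calculation. The sketch below is not any of the Learn++ algorithms; it assumes, purely for illustration, that all classifiers are independent and share the same accuracy on a two-class problem, and computes the probability that a simple majority vote is correct. The function name `majority_vote_accuracy` is hypothetical.

```python
from math import comb


def majority_vote_accuracy(n_classifiers: int, p_correct: float) -> float:
    """Probability that a strict majority of n independent classifiers is correct.

    Assumption (not from the source): every classifier is correct with the
    same probability p_correct, independently of the others.
    """
    if n_classifiers % 2 == 0:
        raise ValueError("use an odd number of classifiers to avoid ties")
    needed = n_classifiers // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n_classifiers, k)
        * p_correct**k
        * (1.0 - p_correct) ** (n_classifiers - k)
        for k in range(needed, n_classifiers + 1)
    )


# Accuracy of the vote for ensembles of increasing size, each member 60% accurate.
for n in (1, 5, 25):
    print(n, majority_vote_accuracy(n, 0.6))
```

Under these assumptions the vote improves with ensemble size only when each member is better than chance; with members worse than chance, the same formula shows the vote getting worse. Real classifiers are rarely independent, so this is an upper-level intuition rather than a performance guarantee.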
More recently, we have started studying more challenging problems in machine learning,
namely learning under nonstationary and drifting environments.
Most machine learning approaches make the fundamental assumption that data are independent and identically distributed,
that is, all data come from the same fixed, yet unknown distribution.
This is clearly not the case in many real world applications,
such as financial data, climate data, energy demand and pricing data, spam detection,
malware detection, among many others.
In such cases, data distributions change over time,
and the classification model must track and adapt to changes in the underlying distributions.
To address these challenges, we have recently developed the Learn++.NSE algorithm.
A further challenge arises when such nonstationary data are also imbalanced,
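To make the idea of tracking a drifting distribution concrete, here is a toy sketch. It is a generic, dynamically weighted majority vote and should not be read as the Learn++.NSE algorithm, whose details are not given on this page. Everything in it is an assumption made for illustration: the one-dimensional data, the drifting class boundary, the threshold classifiers ("stumps"), the log-odds weighting, and all function names.

```python
import math
import random


def make_batch(rng, boundary, n=200):
    """Draw n points uniformly from [0, 10]; the label is 1 when x exceeds the boundary."""
    return [(x, int(x > boundary)) for x in (rng.uniform(0.0, 10.0) for _ in range(n))]


def train_stump(batch):
    """Fit a threshold midway between the largest class-0 point and the smallest class-1 point."""
    largest_neg = max(x for x, y in batch if y == 0)
    smallest_pos = min(x for x, y in batch if y == 1)
    return (largest_neg + smallest_pos) / 2.0


def error_rate(predict, batch):
    """Fraction of points in the batch that predict() labels incorrectly."""
    return sum(predict(x) != y for x, y in batch) / len(batch)


def member_weight(err, eps=1e-4):
    """Log-odds voting weight; members no better than chance get weight zero."""
    err = min(max(err, eps), 0.5)
    return math.log((1.0 - err) / err)


def run_demo(seed=0):
    rng = random.Random(seed)
    boundaries = [3.0, 4.0, 5.0, 6.0, 7.0]  # the class boundary drifts batch by batch
    stumps, weights = [], []
    for boundary in boundaries:
        batch = make_batch(rng, boundary)
        stumps.append(train_stump(batch))
        # Re-weight every member by its error on the most recent batch only.
        weights = [
            member_weight(error_rate(lambda x, t=t: int(x > t), batch))
            for t in stumps
        ]

    def ensemble(x):
        score = sum(w * (1 if x > t else -1) for t, w in zip(stumps, weights))
        return int(score > 0)

    def static(x):
        # Baseline: the classifier trained on the first batch, never updated.
        return int(x > stumps[0])

    test = make_batch(rng, boundaries[-1])  # fresh data from the latest distribution
    return error_rate(ensemble, test), error_rate(static, test)


ensemble_error, static_error = run_demo()
print("re-weighted ensemble error:", ensemble_error)
print("static classifier error:  ", static_error)
```

In this toy setting the member trained on the newest batch receives the largest weight, so the ensemble follows the moving boundary while the static classifier keeps using the original one.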
where one (or more) classes are severely underrepresented in the environment.
This problem was addressed through Learn++.CDS (Concept Drift with SMOTE)
and Learn++.NIE (Nonstationary and Imbalanced Environment) algorithms.
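SMOTE (Synthetic Minority Over-sampling Technique) balances classes by creating synthetic minority examples along the line segments between a minority point and its nearest minority neighbors. The sketch below shows only that interpolation step in plain Python; how Learn++.CDS combines it with drift handling is not described here, and the function name `smote_like`, the parameter `k`, and the sample points are hypothetical.

```python
import random


def smote_like(minority, n_synthetic, k=3, seed=0):
    """Generate synthetic minority points by interpolating toward near neighbors.

    minority: list of equal-length tuples of floats (at least two points).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Sort the other minority points by squared Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])  # one of the k nearest neighbors
        u = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


minority_class = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
for point in smote_like(minority_class, n_synthetic=5):
    print(point)
```

Because each synthetic point is a convex combination of two real minority points, it always lies within the region already spanned by the minority class.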
Currently, under a new grant from NSF, we are investigating an even more challenging problem,
where a streaming and nonstationary environment provides only unlabeled data (after a small initial training set).
We are working on an algorithm called COMPOSE (Compacted Object Sample Extraction),
whose preliminary analysis provides very promising results.
Machine Learning Research at SPPRL
For more information on use of ensemble systems and Learn++ algorithms on incremental learning,
concept drift, missing feature, data fusion, and confidence estimation,
as well as COMPOSE on initially labeled streaming nonstationary environments,
please visit the SPPRL
Tutorials
An overall review of various pattern recognition techniques
as well as individual components of a complete pattern recognition system.
Published in the Wiley Encyclopedia of Biomedical Engineering.
A very popular tutorial of ensemble systems,
including step-by-step descriptions of many commonly used
and popular algorithms and novel applications of ensemble systems.
Published in IEEE Circuits & Systems Magazine, vol. 6, no. 3, 2006.
(most heavily cited paper in this magazine)
A tutorial on how ensemble systems can be used for incremental learning,
data fusion, and missing feature analysis.
Published in IEEE Signal Processing Magazine, vol. 24, no. 4, 2007.
Other Research
Computational Approaches for Early Diagnosis of Alzheimer’s Disease(2)
This was my first research project: the one for which I wrote my first paper,
attended my first conference, and wrote my M.S. thesis.
Considering that Alzheimer’s disease (AD)
is one of the most common neurological disorders among elderly people,
and that there is no proven method for diagnosis or cure,
this project certainly has a very special meaning for me.
The idea is to analyze evoked potentials of the EEG through multiresolution wavelet analysis
and using a neural network to classify them as normal or AD.
The initial results have been promising,
and therefore, in collaboration with the
University of Pennsylvania’s
Memory Disorders Clinic—Alzheimer’s Disease Center,
I am currently working on extending this work to a larger cohort of patients.
We are also looking at the ability of ensemble-based systems
to provide a natural mechanism for heterogeneous data fusion of EEG,
MRI and PET data for early diagnosis.
Figure: 7-level wavelet decomposition of event-related potentials (panels: Alzheimer patient, Control).
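The multiresolution analysis described above can be sketched with a few lines of Python. This is a minimal illustration, not the processing pipeline used in the project: the Haar wavelet, the 256-sample synthetic signal, and the use of per-band energies as classifier inputs are all assumptions, and the function names are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Repeatedly split the approximation; returns (final approximation, list of detail bands)."""
    if len(signal) % (2**levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)  # details[0] is the finest (highest-frequency) band
    return approx, details


def band_energies(approx, details):
    """Energy in each detail band, finest first, followed by the approximation energy."""
    return [sum(c * c for c in band) for band in details] + [sum(c * c for c in approx)]


# Synthetic stand-in for an evoked potential: a slow component plus a faster one.
signal = [
    math.sin(2 * math.pi * 3 * n / 256) + 0.5 * math.sin(2 * math.pi * 20 * n / 256)
    for n in range(256)
]
approx, details = haar_decompose(signal, levels=7)
features = band_energies(approx, details)
print([len(band) for band in details], len(approx))
print(features)
```

A feature vector of this kind (one number per resolution level) is one plausible input to a neural network classifier. Because the Haar transform used here is orthonormal, the band energies sum to the energy of the original signal.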
Non-Destructive Evaluation
Metal pieces joined by welding develop cracks around the weld region due to various stress factors. For example, stainless steel pipes used in nuclear power plants develop cracks around weld regions due to heavy fluid flow, temperature changes, etc. These cracks, if not detected in time, can cause radioactive material to leak, with devastating consequences. Ultrasonic waves are commonly employed to detect these flaws long before they cause any significant damage. The idea is to launch an ultrasonic wave into the material and analyze the signal that is received. If there is any discontinuity within the material, such as a crack, the ultrasonic wave is reflected from the discontinuity and received through an ultrasonic transducer. Since no damage is caused to the material, this process is called non-destructive evaluation (NDE) of the sample. Unfortunately, cracks are not the only type of reflector that can be found in such a pipe. Other reflectors include counterbores, weld roots, slags, porosity and lack of fusion, all of which result from the welding process and do not necessarily pose any immediate danger to the life cycle of the material. The reflections received from these reflectors look very much like those of cracks, and the challenge is to identify the ones that actually come from cracks. Similar applications include detecting and identifying various types of flaws and defects in gas transmission pipelines, aircraft, submarine or engine parts, etc., whereas other measurement modalities include eddy current, magnetic flux leakage, thermal imaging, etc. The NDE-Lab at Rowan is equipped with various test equipment to make a wide variety of NDE-related measurements.
A current joint project between the NDE Lab and the SPPRL is funded by the DOE and is developing data fusion algorithms for combining information from several different measurement modalities for better material characterization. The NDE-Lab also includes a state-of-the-art virtual reality system for advanced visualization, funded by the NSF.
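The pulse-echo principle described above reduces to simple arithmetic: the wave travels to the reflector and back, so the reflector depth is half the round-trip distance. The sketch below is only an illustration; the function name is hypothetical, and the default sound velocity of 5800 m/s is an assumed, approximate value for longitudinal waves in steel, not a figure from this project.

```python
def reflector_depth_mm(echo_time_us: float, velocity_m_per_s: float = 5800.0) -> float:
    """Depth of a reflector, in millimeters, from the round-trip echo time in microseconds.

    velocity_m_per_s is an assumed, illustrative sound velocity for the material.
    """
    round_trip_m = velocity_m_per_s * echo_time_us * 1e-6  # total distance traveled
    return round_trip_m / 2.0 * 1000.0  # halve for one-way depth, convert m to mm


# An echo arriving 10 microseconds after the pulse is launched.
print(reflector_depth_mm(10.0))
```

Depth alone does not reveal whether the reflector is a crack or a benign feature such as a counterbore or weld root, which is why the shape of the reflected signal has to be classified as well.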
Integration of Novel Content (BME) into Existing Curriculum (ECE)(3)
A time-honored technique for introducing students to new and emerging topics is to offer electives; however, this approach has a few major drawbacks: the topic must be very focused, so either depth or breadth must be sacrificed; in either case, only a very limited amount of material can be covered; and students without prior background in the topic often hesitate to elect a course in which they might well find interest. Furthermore, as the number of credits required to obtain a BS degree declines over the years due to market pressures, so does the number of electives offered.
Against this background, we propose another time-honored technique, but this time in a new setting, as a paradigm specifically designed for integrating novel content into an existing curriculum: develop new laboratory exercises tailored to provide content-specific knowledge related to the focus areas of existing courses. In our implementation, we use biomedical engineering (BME) as the novel content and electrical and computer engineering (ECE) as the core curriculum, with two primary objectives: to provide ECE students with fundamental and contemporary BME knowledge for future career and graduate study opportunities; and to improve students’ interest in and comprehension of ECE concepts by acquainting them with engineering solutions to real-world problems in medicine. This approach has several advantages: (1) it is versatile, in that any number of topics the faculty deems important can be integrated; (2) a broad spectrum of topics can be addressed, as they are distributed throughout the 4-year curriculum; (3) all students are exposed to novel content; (4) very few additional resources are required for implementation; (5) students receive a more well-rounded and broad education within their specific disciplines; (6) experiments are integrated into existing courses, keeping the credit count unchanged; (7) electives can then be devoted to covering specific issues in depth, and students will be able to make better-informed decisions about choosing related electives.
For a complete list of our work, please see the
List of Publications
For more information on what SPPRL can do for your company / institution,
please contact Dr. Robi Polikar at:
214 Rowan Hall, Dept. of Electrical and Computer Engineering
Rowan University, 201 Mullica Hill Road, Glassboro, NJ 08028
Phone: (856)256-5372
Fax: (856)256-5241
polikar@rowan.edu