|Title||Monitoring Expert System Performance Using Continuous User Feedback|
|Author(s)||Michael G. Kahn, MD, PhD; Sherry A. Steib; William Claiborne Dunagan, MD; Victoria J. Fraser, MD|
|Source||Journal of the American Medical Informatics Association, Vol. 3, No. 3, Pages 216-223|
|Publication Date||May/June, 1996|
|Abstract||Objective: To evaluate the applicability of metrics collected during routine use to monitor the performance of a deployed expert system. |
Methods: Two extensive formal evaluations of the GermWatcher (Washington University School of Medicine) expert system were performed approximately 6 months apart. Deficiencies noted during the first evaluation were corrected via a series of interim changes to the expert system rules, even though the expert system was in routine use. As part of their daily work routine, infection control nurses reviewed expert system output and changed the output results with which they disagreed. The rate of nurse disagreement with expert system output was used as an indirect or surrogate metric of expert system performance between formal evaluations. The results of the second evaluation were used to validate the disagreement rate as an indirect performance measure. Based on continued monitoring of user feedback, expert system changes incorporated after the second formal evaluation have resulted in additional improvements in performance.
Results: The rate of nurse disagreement with GermWatcher output decreased consistently after each change to the program. The second formal evaluation confirmed a marked improvement in the program's performance, justifying the use of the nurses' disagreement rate as an indirect performance metric.
Conclusions: Metrics collected during the routine use of the GermWatcher expert system can be used to monitor the performance of the expert system. The impact of improvements to the program can be followed using continuous user feedback without requiring extensive formal evaluations after each change. When possible, the design of an expert system should incorporate measures of system performance that can be collected and monitored during the routine use of the system.
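
The surrogate metric described in the Methods can be illustrated with a minimal sketch. The abstract does not give a formula, so this assumes the disagreement rate is the fraction of reviewed outputs that a nurse changed, grouped by the rule version in effect. The function name, the record layout, the version names, and the `flag`/`no_flag` labels are all hypothetical and are not taken from GermWatcher.

```python
from collections import defaultdict


def disagreement_rates(reviews):
    """Return the fraction of system outputs changed by reviewers, per period.

    `reviews` is an iterable of (period, system_output, reviewer_output)
    tuples. This layout is an assumption made for illustration only.
    """
    totals = defaultdict(int)
    changed = defaultdict(int)
    for period, system_output, reviewer_output in reviews:
        totals[period] += 1
        # A disagreement is any case where the reviewer overrode the output.
        if system_output != reviewer_output:
            changed[period] += 1
    return {period: changed[period] / totals[period] for period in totals}


# Made-up sample data: two hypothetical rule versions, four reviews each.
sample_reviews = [
    ("rules_v1", "flag", "flag"),
    ("rules_v1", "flag", "no_flag"),
    ("rules_v1", "no_flag", "no_flag"),
    ("rules_v1", "no_flag", "flag"),
    ("rules_v2", "flag", "flag"),
    ("rules_v2", "no_flag", "no_flag"),
    ("rules_v2", "flag", "flag"),
    ("rules_v2", "no_flag", "flag"),
]

rates = disagreement_rates(sample_reviews)
for period, rate in sorted(rates.items()):
    print(f"{period}: {rate:.2f}")
```

A rate that falls from one rule version to the next would be consistent with the improvement the abstract reports, although the study relied on a second formal evaluation to validate that reading.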