By Deborah H. Rulo

[The author is a graduate of the University of Missouri-Kansas City School of Law. This is an abstract and a condensed version of a research paper written by the author in partial fulfillment of the research and writing component required for successful completion of the J.D. program at the University of Missouri-Kansas City, April 1999. The selected portions of this article explain the nature of the test and express an opinion upon the reliability/validity of the results obtained by clinicians. The extensive footnotes contained in the research paper have been omitted.]

John Smith is a defendant who has just pleaded guilty to the federal offense of possessing child pornography. John is awaiting his sentencing hearing when he is told that the court has ordered him to take a test. John is told that in the administration of this test a clinician will attach a device to his penis to monitor its responses to various graphic images to which he will be exposed. The results of this test, John is told, will be used as evidence in his sentencing hearing and later, as a condition of his release, to make predictions about his future as a sexual deviant and to monitor his rehabilitative progress. [Hypothetical adapted from the facts in United States v. Wilson, 1998 WL 939987 (6th Cir. 1998).]


The test described in the above hypothetical is called Penile Plethysmography. The Plethysmograph measures penile tumescence in response to laboratory stimuli. Results of the test are used to diagnose, evaluate, and treat sexual deviance. It has become a useful, though controversial, scientific and legal tool.

For what purposes is the test used? What is "measured"?

Penile plethysmography is a test utilized to make subjective determinations about an individual's psychological condition with regard to certain sexual disorders and deviant behavior in the male. The objective function of the test is to measure physical sexual arousal. The measurement is based on small increases in blood flow that are nearly undetectable to the subject. In that sense, the test assumes that there exists a natural operational relationship between specific stimuli, penile response, and specific overt sexual acts.

There are two major methods for measuring erectile responses. Measuring the change in penile volume is one method; monitoring the change in penile circumference is the other. The penile circumference measure has become the most widely used and accepted method within the psychiatric and psychological communities. This change in circumference is measured by a "mercury strain gauge" or Barlow gauge. After being examined to ensure that it functions properly, the gauge is fitted around the penis. Once in place, the gauge is calibrated to determine the subject's baseline tumescence level. Once the calibration process is completed, the actual testing begins. The subject is presented with stimulus material, both auditory and visual, and the increase in circumference is recorded as it correlates to certain varieties of suggestive stimuli.

Selection of stimulus material is a crucial variable

The nature of the stimulus selected for use in the procedure is a variable of particular importance if one is to obtain meaningful results. This is the aspect of the testing procedure that differentiates "inappropriate" arousal from "normal" arousal. Clinicians select or create their own stimulus materials; no standard set of stimuli has been compiled. Subjects are exposed to as many as twelve different potentially sexually stimulating fantasies. A complete set of twelve audio or visual presentations might consist of six consenting and six non-consenting scenarios. The sets are designed to include situations depicting children, adolescents and adults of both sexes.

Scoring and interpreting test results 

Just as crucial to obtaining a meaningful result when administering penile plethysmography is the scoring of the responses, which leads to conclusory inferences regarding the subject's sexual deviancy. One very commonly used method of scoring provides a "ratio index of deviant arousal." This ratio index is calculated by "dividing arousal to inappropriate material (e.g., depictions of rape) by arousal to appropriate materials (e.g., depictions of consenting sex between adults)." Child molesters and would-be rapists are differentiated from "non-sex offenders" by their penile responses to the relevant inappropriate material.
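
The arithmetic behind the ratio index can be illustrated with a minimal sketch. The function name, the use of millimetres of circumference change, and the averaging of responses across presentations are all assumptions made for illustration; the research summarized here does not prescribe a particular implementation, and the values shown are invented.

```python
from statistics import mean


def ratio_index(deviant_responses, appropriate_responses):
    """Mean response to inappropriate stimuli divided by mean response
    to appropriate stimuli.

    Both arguments are hypothetical lists of circumference changes
    (assumed here to be in millimetres), one value per presentation.
    Averaging across presentations is an assumption of this sketch.
    """
    baseline = mean(appropriate_responses)
    if baseline <= 0:
        # No measurable response to appropriate material: the ratio is undefined.
        raise ValueError("no response to appropriate stimuli; index undefined")
    return mean(deviant_responses) / baseline


# Invented illustrative values, not data from any study.
print(ratio_index([2.0, 4.0], [10.0, 14.0]))  # 3.0 / 12.0 = 0.25
```

The sketch also shows one limit of any ratio measure: when a subject shows no measurable response to the appropriate material, the index cannot be computed at all. How a given index value should be interpreted is a separate question that the calculation itself does not answer.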

The technology of penile plethysmography is not new; it has a history dating back approximately eighty years. While used for various purposes in the past, today it is used primarily to uncover sexual patterns and preferences, which findings are then used to make predictive determinations about overt, manifest behavior. Forensically, it has been used primarily in criminal sexual offender cases and in some family court cases as well. It is used as a basis for establishing pedophilic profiles, and also frequently as a defense tool to support a defense position that the individual lacks the indicia of a particular profile.

Validity/Reliability of the test

If use of penile plethysmography is to be translated into the admission of expert witness testimony in criminal cases, the reliability and validity of its use are at the heart of the admissibility issue.

It is fairly well settled that penile plethysmography is the most reliable means of measuring actual change in penis circumference and sexual arousal. Published research materials attest to the reliability of plethysmography as an indicator of arousal, though the scores obtained show a wide range of reliability coefficients. One very important area of reliability (particularly when the Plethysmograph is used for predicting behavior) that has not been widely studied is the temporal stability of response patterns. The pattern must remain substantially the same over time, indicating stability, to be a useful basis for predicting future behavior. The research on this temporal stability has been sparse. Although researchers and clinicians seem to be evenly split in their views of the reliability of penile plethysmography, the dispute seems to center less on the role that the technique should play in the assessment of sex offenders and more on a simple contention that limitations do exist which warrant caution in its use.

When the scientific validity of a measure such as penile plethysmography is being analyzed, the specific application for which the measure is used is a crucial consideration. There is no question but that the validity is very high when it comes to detecting male arousal. The question is how this result relates to predicting future deviant behavior. One common approach taken by researchers has been to differentiate the results obtained from rapists versus non-rapists. Several studies concluded that there is a statistically significant correlation between the amount of violence shown in the stimuli and the increased response rates from the known rapists. One study examined the distinctions in responsiveness for validity between groups of sex offenders, non-sexual attackers, and a control group of "normal heterosexuals." This study concluded that "groups did not differ on the consenting heterosexual activity film, but typical heterosexual and non-sexual attackers of women showed significantly less response to the film of girl abduction . . . and rape . . . than did groups of sex offenders generally." [Hinton, O’Neill & Webster, Psychophysiological Assessment of Sex Offenders in Security Hospital, 9 Archives of Sexual Behavior 205-216 (1980).]

Recent similar studies appear to have also shown validity in the use of penile plethysmography to detect child molesters. However, in many of these studies at least ten percent of the subjects did not become aroused to any stimuli, indicating either that subjects may be able to manipulate results or that the assumption that overt sexual or violent behavior is causally linked to arousal is flawed. Furthermore, other researchers have found low external validity. These studies have shown weak correlations between "phallometric indices" and overt sex behavior.

Specific areas of concern about validity 

The questions and disputes surrounding the scientific reliability and validity of penile plethysmography results seem to stem from three specific areas of concern within the behavioral science community. These areas are: (1) lack of standardization; (2) susceptibility of the test to manipulation or feigning; and (3) a concern that the inference that arousal is linked to deviancy is an excessive extension of the present understanding of arousal and deviant behavior.

1. LACK OF STANDARDIZATION

The first concern is with the lack of standardization. Most widely used psychometric tests have undergone standardization in use patterns within the population norms selected for their projected uses. The Plethysmograph lacks such standardization. Manufacturers have not published administrative and interpretative standards and guidelines that are consistently followed in the use of the instrument. The problem of a lack of standardization affects not only the administration of the procedure but also, and more importantly, the interpretation of the subsequent results.

2. SUSCEPTIBILITY TO MANIPULATION

The second objection or concern that has been articulated by some members of the scientific community stems from the susceptibility of the test to manipulation by the subject. Some researchers suggest that a man is capable of controlling erectile responsiveness to otherwise sexually arousing or provocative stimuli. Perhaps it should be remarked here, though, that the penile Plethysmograph can detect increases in tumescence so minute that they are undetectable to the subject, which other scientists see as eliminating the manipulation concern.

Critics also argue that there is no accurate way of detecting or measuring "response suppression or false positive" outcomes. Erectile responses are not simply reflexive reactions; they can be influenced by factors internal to the subject, such as embellishing fantasy. Voluntary control over reaction to stimuli involves "both cognitive and mechanical processes." Cognitive control usually includes either elaborations of stimuli presented to increase tumescence when exposed to "normal" imagery or, conversely, inattention to stimuli that are perceived as "deviant" to decrease tumescence. Although some techniques have been created to minimize the likelihood or possibility of feigning responses, there remains no generally accepted procedure for controlling, detecting, or estimating when or how frequently it occurs.

3. THE INFERENCE THAT AROUSAL IS LINKED TO DEVIANCY

The final area of penile plethysmography testing that is ardently disputed, and the one that relates most directly to validity, is the inference that penile arousal is linked to deviancy that will manifest or has manifested itself in certain overt behaviors. In essence, the concern is that the results of the penile Plethysmograph test cannot be accurately translated into any useful, predictive information. Many researchers believe that in some cases inappropriate or deviant sexual behavior is wholly independent of arousal or genital responsiveness. For example, "the vast majority of individuals who commit sexual offenses against children are not sexually aroused by stimulus material involving children; 'their primary sexual orientation is to adults and they molest children by fantasizing that they are engaging in relationships with appropriate sex partners.'" [Quoting prosecution expert from trial record in State v. Spencer, 459 S.E.2d 812, 815 (N.C.App. 1995).]

Conclusion

There is some indication from data collected that, "contrary to the notion of stimulus specific responding, arousal among sex offenders occurs in a generalized manner." [Simon & Scharten, Plethysmography in the Assessment and Treatment of Sexual Deviancy: An Overview, Vol. 20, No. 1, A. Sex. Behav. 76, 85 (1991).] This may indicate that even though at times the correlation produced in validity tests was very high and though the test does accurately measure arousal, its function is little more than a "yard stick" due to the limitations some would place on the predictive meaning of those results in terms of overt sexual aggressive behavior.

While penile plethysmography has been studied using scientific methodology by practitioners and clinicians who find its predictive and forensic value to be well settled and accurate, the disparity within the community regarding the reliability and validity of the method when used forensically undoubtedly predicts a rocky road in the courtroom for its proponents when admissibility of predictive opinions on future sexual dangerousness is at issue.

-------

The opinions expressed herein are those of the author. Publication indicates neither approval nor disapproval of these opinions. Editor.

