INTRODUCTION
In 2004, the Department of Energy (DOE) established the Human Reliability Program (HRP) to ensure that individuals meet the highest standards of reliability, including physical and mental suitability, before they are allowed to a) perform hands-on work with a type of special nuclear material (SNM I), nuclear weapons, and nuclear weapon components, b) protect and/or transport SNM I, weapon components, and fully assembled weapons, and/or c) have specialized and therefore sensitive knowledge of SNM I, weapon components, and fully assembled weapons. The program was codified in the Code of Federal Regulations (10 CFR 712).
The HRP is implemented by DOE and its semiautonomous agency, the National Nuclear Security Administration (NNSA), to mitigate personnel-related safety and security risk at its nuclear weapons facilities. To this end, HRP requires candidates and incumbents to submit to annual psychological and medical evaluations as well as to personnel security and supervisory reviews. The process is designed to detect potential vulnerabilities that could impact participants’ ability to execute duties in a safe and secure manner (Department of Energy, 2018).
While the implementation of the HRP pre-dates the establishment of DOE’s Insider Threat Program (ITP; DOE, 2014), it clearly falls under the umbrella of ITP. In fact, the U.S. Government Accountability Office (GAO) references HRP as an insider threat measure at nuclear weapons facilities in its recent review of DOE’s Insider Threat Program (Bawden, 2023). In this report, GAO reiterates HRP’s intent “to ensure that only individuals who meet the highest standards of reliability and physical and mental suitability have access to certain materials, nuclear explosive devices, and facilities” (Bawden, 2023, p. 6). Indeed, given its primary objective of identifying and mitigating risk posed by employees and contractors, HRP is an integral part of DOE’s efforts to fully integrate and implement its ITP. The fact that HRP-certified individuals have direct access to nuclear weapons, weapon parts, SNM I, sensitive nuclear weapons information, and the systems that protect and/or transport these resources underlines the importance of HRP and its efficacy.
The HRP program is a selection and retention program in that it requires initial and annual medical and psychological screenings to determine its participants’ suitability for certification. In this regard, it is a continuous monitoring program designed to mitigate insider threat. There are a variety of safety or security-sensitive roles that require cognitive, personality, and/or psychopathology assessments, including Special Forces (Farina et al., 2019), law enforcement (Ellingwood et al., 2020), firefighters (Barrett et al., 1999), astronauts (Beven et al., 2018), air traffic controllers (Schutt & Torrence, 2022) and airline pilots (Carretta et al., 2013). Most of the published research has evaluated whether the psychological evaluation predicted selection (e.g., Farina et al., 2019) or successful completion of the required training (e.g., Matton et al., 2013). While these enterprises have, no doubt, attempted to determine whether their respective evaluation process is associated with short- and long-term safety or security risk (or other performance outcomes), circulation of the results of these projects has been limited. Likewise, there have been no published studies of the HRP psychological evaluation process. As such, there is little direct empirical support concerning the prediction of future risk-related behavior or outcomes in the context of HRP or similar agencies.
Nonetheless, there are relevant literatures which have informed the HRP evaluation process. For example, the study of assessment and selection for military special operation forces has demonstrated that cognitive testing (Farina et al., 2019; Schmidt, 2014), psychological measures of personality attributes (Farina et al., 2019; Gucciardi et al., 2021), and structured interviews (Picano & Roland, 2012) are predictive of suitability for these elite positions. Furthermore, personality assessment has been found to predict future counterproductive work behavior (CWB; Anglim et al., 2018). Other relevant literatures include risk factors for workplace violence (Geck et al., 2017) and the link between cognitive deficits and workplace competency (Korinek et al., 2009). Taken together, these studies provide theoretical support for the current elements of the HRP psychological evaluation in predicting future job performance and/or misbehavior. Ultimately, the goals of HRP and other counter-insider threat programs across the intelligence community are the same: to deter, detect, and mitigate harmful acts by insiders (Staal & Harvey, 2019).
To be scientifically “valid,” the HRP psychological evaluation process should be found to accomplish what it purports to do – identify indicators of elevated safety and security risk. As such, the HRP psychological evaluation elements should differentiate between employees who demonstrate elevated risk versus those who do not. In an effort to determine the efficacy of the HRP psychological evaluation process in accomplishing this task, the PERIL project identified HRP participants whose psychological evaluation elements should reflect elevated risk. For example, individuals who show forms of workplace misbehavior (e.g., rules violations) would be expected to show elevated risk indicators in the evaluation(s) that preceded the misconduct. If proven, this would demonstrate predictive validity for this aspect of the HRP process. Likewise, there are some HRP participants whose evaluations lead to additional assessment (e.g., further testing, external referrals, outside mental health records review). The assessment records of individuals who receive this sort of extended evaluation, despite eventual approval, should demonstrate evidence of elevated risk. If shown to be true, this would support the concurrent validity of the psychological evaluation process by demonstrating that requiring HRP participants to complete additional assessments is not random, but rather a result of a psychologist’s recognition of risk indicators that warrant the extra effort.
As defined in the DOE Order establishing its Insider Threat Program, “‘Insider Threat’ means the threat that an insider will use his/her authorized access, wittingly or unwittingly, to do harm to the security of the United States” (2014, p. 12). Given the fact that those certified in the HRP have direct access to assets of the U.S. nuclear weapons enterprise, the risk of insider threat activities is particularly grave. As such, determining risk indicators for HRP certified individuals is important.
DOE’s HRP regulation sets forth certain risk indicators in the form of reliability criteria (10 CFR 712.13(c)(1-13)). These include “Psychological or physical disorders that impair performance of assigned duties,” “Conduct that warrants referral for a criminal investigation or results in arrest or conviction,” “Indicators of deceitful or delinquent behavior,” “Attempted or threatened destruction of property or life,” and “Failure to comply with work directives, hostility or aggression toward fellow workers or authority, uncontrolled anger, violation of safety or security procedures, or repeated absenteeism” among others (2018, p. 9). One of the challenges of conducting research regarding HRP outcomes is that such occurrences (i.e., insider threat behaviors) are relatively infrequent.
Whereas the HRP psychological evaluative process produces voluminous information regarding the individual participant and, when aggregated, about the HRP-certified population in general, reliable statistical analyses require limiting the number of predictor variables relative to the number of individuals exhibiting specific categories of behaviors (i.e., criterion groups). This means that PERIL criterion groups had to be carefully selected.
The PERIL research team included seven clinical psychologists with decades of professional experience. Three of these psychologists were DOE HRP Designated Psychologists. A fourth psychologist was formerly an HRP Designated Psychologist. The fifth and sixth psychologist were university professors and experienced researchers who were intimately familiar with HRP. Additionally, four members of the research team were extensively involved in the pilot phase of this project (Reynolds et al., 2015), which significantly informed and improved the current project. Thus, based on prior experience and after extensive analysis, the PERIL researchers determined that the most relevant types of risk could be sorted into six distinct criterion groups. These included HRP participants who were:
- approved for the program only after a more extended evaluation and monitoring,
- approved for the program but were later removed due to emergent medical, substance misuse, or other mental health concerns,
- approved for the program but were later involved in one or more Incidents of Security Concern investigations (IOSC),
- approved for the program but were later the subject of an Ethics investigation,
- approved for the program but were later determined to have violated workplace rules, and
- approved for the program but were found to have experienced an Occupational Safety and Health Administration (OSHA) recordable injury or illness.
It was hypothesized that one or more psychological evaluation elements would predict membership in each criterion group. More specifically, we hypothesized that negative events and circumstances in an employee’s history (e.g., legal charges, job terminations, problems with substance use, elevations on MMPI clinical scales) would be associated with criterion group membership (e.g., extended psychological evaluations, rules violations, incidents of security concerns (IOSC), temporary or permanent removal from the program). Accessing large datasets from two DOE/NNSA sites promised to be invaluable in determining the predictive ability of the HRP psychological evaluation process. Analyzing the data gathered would help determine which elements of the HRP evaluation process reliably identify safety or security risk, thereby informing any future revision to the HRP evaluative processes as well as the refinement of the regulation and related implementation guidance.
METHODS
Participants
The data used in this project were based entirely on archival records drawn from the HRP psychological evaluation process and information from the Security, Safety, Ethics, and Human Resources departments at each of the DOE/NNSA participating sites. Thus, there were no direct interventions with participants and no procedures required to monitor for safety or mitigation of risk to participants other than providing for de-identification of the datasets.
The Y-12 sample was comprised of all 3,593 employees who received one or more annual psychological evaluations as HRP candidates or incumbents between January 1, 2013 and December 31, 2019. The full sample had a mean age of 45.7 (SD = 10.9) and was comprised of 86.0% males and 14.0% females. The sample had a mean of 14.2 years of education (SD = 2.1). The race/ethnicity of the sample was 89.1% White and 10.9% Other race/ethnicity. The mean number of years worked at Y-12 for the sample was 12.4 (SD = 9.8).
The Pantex sample was comprised of all 3,114 employees who received one or more annual psychological evaluations as HRP candidates or incumbents between January 1, 2019 and December 31, 2021. The full sample had a mean age of 45.8 (SD = 11.8) and was comprised of 83.3% males and 16.7% females. The sample had a mean of 14.6 years of education (SD = 2.1). The race/ethnicity of the sample was 70.8% White and 29.2% Other race/ethnicity. The mean number of years worked at Pantex for the sample was 13.6 (SD = 11.0).
Pantex and Y-12 are two of DOE’s nuclear weapons sites. Nearly 50% of those certified in the HRP across the enterprise are employed at these facilities. Since 2014, Pantex and Y-12 have been managed under the same management and operations (M&O) contractor, Consolidated Nuclear Security, LLC (CNS). The bulk of the PERIL pilot project (Reynolds et al., 2015) was conducted at Y-12 prior to CNS becoming the M&O contractor for both facilities. The contract change, together with the fact that Pantex and Y-12 used the same electronic medical record system, afforded the opportunity to include additional data in this study.
For each site, de-identified employee records were coded to determine membership in one or more criterion groups versus the control group (i.e., having a zero (0) on the dependent binary indicator of group membership in all years). Individuals who experienced no adverse safety or security-related events across all annual evaluations during the data collection periods were placed in the control group. The final sample sizes of the control and criterion groups are displayed in Table 1.
Measures
The HRP regulation requires the use of two primary psychological evaluation instruments. First, a semi-structured interview is required as part of each HRP psychological evaluation. Second, a psychological test is required during an initial/candidate HRP psychological evaluation and triennially thereafter (Department of Energy, 2018). Both Y-12 and Pantex utilize the Electronic Medical and Business Operating System (EMBOS©). The version of the semi-structured interview available in EMBOS is the Structured Interview Survey (SIS).
Structured Interview Survey (SIS): The SIS questionnaire is a semi-structured interview (identical at both sites). Once completed by the HRP candidate or incumbent, the SIS serves as an interview guide for the psychologist who conducts the HRP evaluation. SIS questions (and any necessary follow-up interview questions) gather details from the following areas of psychosocial history:
- Educational/developmental history
- Military history
- Work history
- Security clearance history
- Family of origin, marital, and parenting history
- Current and historic financial circumstances
- Current and historic alcohol use
- Substance misuse/abuse history
- Legal history
- Prior psychotropic medication use
- Medication side effects
- Psychiatric symptoms and history
- Sleep habits, quality, and difficulties
- Life history of events that increase the potential for blackmail or coercion (e.g., personal life secrets)
Although the SIS has undergone several revisions since 2004, a core set of questions remained consistent across iterations. Consistent SIS content was identified and coded into the PERIL database. The following set of variables derived from the SIS were used as predictors of membership in the criterion groups.
- Years worked at Y-12/Pantex
- Ever diagnosed with a learning disability
- Ever suspended or expelled from school
- Problems with impulse control, hyperactivity, or attention
- History of child abuse or neglect
- Ever used marijuana
- Ever used hard drugs
- Number of arrests or charges (excluding alcohol or public intoxication arrests)
- Ever taken medication for a mental health condition
- Ever received any psychological counseling or therapy
- Ever hospitalized for a mental health or substance use problem
- Ever thoughts of hurting others
- Temper ever caused problems
- History of extra-marital affairs
- Self-reported job stress
- Rotating work shift (fluctuating work schedule)
The following composite variables were created to better examine some broad constructs comprised of similar items.
- Total adverse work events (the sum total of affirmative responses on 14 SIS items, such as reprimanded for conduct, terminated or forced to quit, accused of time card fraud, etc.)
- Ever have a problem with peers or supervisors (the sum total of affirmative responses on 2 SIS items)
- History of domestic violence perpetration (the sum total of affirmative responses on 3 SIS items, including history of a restraining order, abuse of a partner, and abuse or neglect of a child)
- History of alcohol problems (the sum total of affirmative responses on 8 SIS items, such as doctor suggested cutting down, felt guilty about drinking, people complaining about drinking, etc.)
- Misuse of prescription medications (any affirmative response on SIS items addressing misuse of one’s own medication, abuse or dependence on one’s own medication, and use of someone else’s medication)
- History of major financial problems (the sum total of affirmative responses on 3 SIS items, including history of bankruptcies, foreclosures, or late income tax filing)
- History of minor financial problems (the sum total of affirmative responses on 4 SIS items, including history of any late bill payments, late child support, wage garnishment, and being sent to collections)
- Total symptoms of depression (the sum total of affirmative responses on 12 SIS items, such as thoughts of hurting self, feeling down or blue, decreased interest or pleasure, etc.)
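The composite scoring described in this list can be illustrated with a minimal sketch. It assumes each SIS item is stored as a yes/no value; the item names below are hypothetical placeholders (the actual SIS item wording is not reproduced in this report), and treating a missing item as "not endorsed" is likewise an assumption made only for illustration.

```python
# Hypothetical item names standing in for the SIS items behind two composites.
MAJOR_FINANCIAL_ITEMS = ("bankruptcy", "foreclosure", "late_tax_filing")
RX_MISUSE_ITEMS = ("misused_own_rx", "dependent_on_own_rx", "used_others_rx")


def sum_composite(responses, items):
    """Count affirmative responses across the listed items (sum-type composite)."""
    # Assumption: an item absent from the record is treated as not endorsed.
    return sum(1 for item in items if responses.get(item, False))


def any_composite(responses, items):
    """Return 1 if any listed item was endorsed, else 0 (any-type composite)."""
    return int(any(responses.get(item, False) for item in items))


# One illustrative (fabricated) evaluation record.
example = {
    "bankruptcy": True,
    "foreclosure": False,
    "late_tax_filing": True,
    "used_others_rx": True,
}

major_financial = sum_composite(example, MAJOR_FINANCIAL_ITEMS)
rx_misuse = any_composite(example, RX_MISUSE_ITEMS)
```

The two helper functions mirror the two kinds of composites in the list: most are counts of endorsed items, while misuse of prescription medications is scored as any affirmative response.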
DOE authorized the suite of Minnesota Multiphasic Personality Inventory (MMPI; Ben-Porath & Tellegen, 2008) instruments to fulfill the HRP requirement for triennial psychological testing (Department of Energy, 2018). Presently, all DOE/NNSA HRP sites make use of one or another version of the MMPI to meet the psychological test requirement of the regulation. Pantex and Y-12 most often use the MMPI-2-Restructured Form® (MMPI-2-RF). The MMPI-2-RF is a 338-item self-report measure empirically associated with psychiatric diagnosis and personality pathology. Such psychodiagnostic measures have become increasingly integrated into personnel evaluation processes in an effort to assure workplace safety and security (Graham, 2005). PERIL analyses included the following MMPI-2-RF subscales as possible predictor variables of membership in one or more criterion groups: Ideas of Persecution (RC6), Antisocial Behavior (RC4), Substance Abuse (SUB), Cynicism (RC3), Anxiety, Low Positive Emotions (RC2), Dysfunctional Negative Emotions (RC7), and Aberrant Experiences (RC8).
Procedures
IRB approval for this study was obtained from both the Oak Ridge Site-wide Institutional Review Board (OSIRB) and the University of Tennessee. No records or data with identifying information were used for data analyses. The Y-12 and Pantex databases were assembled and de-identified within the secured server on the Y-12 network (behind “the firewall”). Each case was de-identified by replacing identifying information (e.g., name, date of birth, employee identification numbers) with randomly generated research participant numbers and removing all other potentially identifying information. All working datasets and the de-identification key are stored within the secured server on the Y-12 network. Only the de-identified datasets were exported to the consulting investigators for analysis.
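The de-identification step described above can be sketched as follows. This is an illustrative outline only, not the procedure actually run on the Y-12 server: the field names, the eight-digit participant-number range, and the function name are all assumptions introduced for the example.

```python
import secrets

# Direct identifiers named in the text; the field names are hypothetical.
DIRECT_IDENTIFIERS = ("name", "date_of_birth", "employee_id")


def deidentify(records, id_field="employee_id"):
    """Replace identifiers with random participant numbers.

    Returns the cleaned records and the linking key. In the workflow
    described above, the key would remain on the secured server and only
    the cleaned records would be exported.
    """
    key = {}      # employee identifier -> research participant number
    used = set()  # participant numbers already assigned
    cleaned = []
    for record in records:
        employee = record[id_field]
        if employee not in key:
            number = secrets.randbelow(10**8)
            while number in used:  # redraw on the rare collision
                number = secrets.randbelow(10**8)
            used.add(number)
            key[employee] = number
        # Drop direct identifiers and attach the participant number.
        out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        out["participant_id"] = key[employee]
        cleaned.append(out)
    return cleaned, key


# Fabricated records: two annual evaluations for one employee, one for another.
records = [
    {"name": "A", "date_of_birth": "1970-01-01", "employee_id": "E1", "year": 2013},
    {"name": "A", "date_of_birth": "1970-01-01", "employee_id": "E1", "year": 2014},
    {"name": "B", "date_of_birth": "1980-01-01", "employee_id": "E2", "year": 2013},
]
cleaned, key = deidentify(records)
```

Reusing one participant number across all of an employee's annual evaluations preserves the repeated-measures structure that the later analyses depend on.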
Because HRP participants are required to submit to annual evaluations, each participant could have received multiple annual HRP evaluations – up to seven at Y-12 and up to three at Pantex in the study datasets. Participants who did not experience a criterion group-related event during the data collection period had values of zero (0) on the dependent variable for all indicators. Thus, these individuals were included as members of the control/comparison group for this study.
DATA ANALYTIC PLAN
Prior to conducting data analyses, investigators cleaned data from the SIS. As above, in preparation for each HRP psychological evaluation, participating employees complete their portion of the SIS before being evaluated by the psychologist. Once complete, the psychologist uses the SIS questionnaire as an outline for interviewing the employee. The psychologists ask questions and document explanations for SIS items in notes embedded within the SIS. Many of the questions on the SIS query employees as to whether a variety of items occurred or otherwise apply to the employee being evaluated. At times, employees were inconsistent in recording their answers on the SIS (e.g., the participant indicated that one of the SIS items occurred in multiple previous years, but not in subsequent years). Sometimes the psychologists documented these inconsistencies or errors in their notes. In these cases, the investigators corrected the data imported from the SIS to reflect the assessing psychologist’s conclusion regarding the most accurate information.
As described above, investigators created eight composite variables derived from the SIS that served as predictor variables in the current analyses. While the PERIL dataset was substantial, the count of participants belonging to each criterion group (i.e., having a one (1) on a dependent binary indicator) varied widely. The assumptions of the data analytic approach employed allowed for only one predictor variable for every 15 members of a given criterion group (Harrell, 2016). Thus, in an effort to maximize the breadth of predictor variables, composite variables were created to examine the prevalence or frequency of any of a combination of variables in a similar category (e.g., financial, substance use, domestic violence, depressive symptoms) or a measure of the number of problems endorsed within each category.
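The constraint of one predictor per 15 criterion group members amounts to simple integer arithmetic. The sketch below uses hypothetical group sizes (the actual counts are reported in Table 1) and a hypothetical function name.

```python
def max_predictors(n_criterion_members, members_per_predictor=15):
    """Largest number of predictors allowed under the 15-members-per-predictor rule."""
    return n_criterion_members // members_per_predictor


# Hypothetical criterion group sizes, for illustration only.
large_group_budget = max_predictors(150)  # 150 // 15
small_group_budget = max_predictors(44)   # 44 // 15
```

Under this rule a small criterion group supports only a handful of predictors, which is why combining related SIS items into a single composite preserves coverage of a construct while spending only one predictor on it.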
Following the completion of data cleaning, the researchers fitted a series of multilevel, logistic regression models, stratified by site (Y-12 or Pantex) (Rabe-Hesketh & Skrondal, 2012). We considered combining both sites into one set of analyses but chose to conduct separate analyses because the data were collected during different time periods for each site (2013-2019 and 2019-2021) and the methods and procedures for collecting data were slightly different across sites. Random effects of the individuals were included in these models to account for the fact that repeated binary indicators of criterion group membership available for the same individual over time may be correlated. The predictor variables were coded from data in the SIS, MMPI-2-RF, and EMBOS. We present coefficients (the change in the log-odds of the event occurring) and p-values for Y-12 and Pantex. Independent variables associated with criterion group membership were considered to be statistically significant with a p-value under 0.05. For model parsimony, the year of the assessment was included in the models as a control variable. This assumed that the relationship between time and a given binary indicator was linear. All analyses were conducted using Stata 17.
RESULTS
All multilevel models, except for the “Labor” criterion group at Pantex, and the “Hold” criterion group at Y-12, had a significant likelihood ratio test result for the variance of the random individual effects, meaning there is significant unexplained between-respondent variance in the binary indicators even after controlling for all predictors, and that using multilevel logistic regression (as opposed to standard logistic regression assuming independent observations) was appropriate. In the two instances where the likelihood ratio test was not significant, that means there is no significant unexplained between-respondent variance and there is no difference between using the multilevel logistic regression approach and a standard logistic regression model after including all the predictors.
We note that some outcomes (e.g., being accused of an ethics violation at Pantex) were measured, but due to low frequency of criterion group membership, were excluded from analyses.
Across all multilevel logistic regression models for Y-12 and Pantex, except for the “Hold” criterion group at Y-12, an increase in one year was associated with a decrease in the log-odds of criterion membership, adjusting for other predictors. That said, an increase in time by one year was associated with an increase in the log-odds of having a hold on approval for the Human Reliability Program measured at Y-12 controlling for other predictors.
Criterion Group A – Having a hold on approval for the Human Reliability Program
For individuals at Y-12, having an anxiety or mood disorder or ever having a problem with a peer or supervisor were associated with higher log-odds of having a hold or otherwise extended HRP evaluation after accounting for the other predictor variables. An increase of one in the total number of adverse workplace events, the total number of major financial problems, the total number of alcohol problems, or the total number of depression symptoms was associated with an increase in the log-odds of having a hold on entry into the HRP after controlling for other predictors (see Table 2).
No MMPI variables were significantly associated with having a hold on approval for the HRP among individuals at Y-12. There are no results for Pantex individuals as there were not enough occurrences of individuals having a hold on approval for HRP to conduct meaningful analyses.
Criterion Group B – Approved for HRP but later removed due to emergent medical, substance misuse, or other mental health concerns
Multiple predictors were found to be significantly associated with the log-odds of being placed in HRP but later removed from HRP at Y-12 (see Table 3). An increase of one in the total number of adverse workplace events is associated with an increase in the log-odds of being placed in HRP but later removed from HRP for an emergent issue at Y-12 controlling for the other predictors in the model. The same is true for an increase in the total number of alcohol problems and the total number of depression symptoms, as an increase of one problem in either is associated with an increase in the log-odds of being placed in HRP but later removed from HRP at Y-12 controlling for the other predictors in the model.
Receiving any psychological help compared to not receiving any psychological help was associated with a decrease in the log-odds of being placed in HRP but later removed from HRP at Y-12 controlling for the other predictors in the model, which may seem contrary to expectations. On the other hand, it may be that individuals who seek behavioral health support are not only exercising self-care but also mitigating risk of future adverse safety or security-related events. A review of the bivariate distribution of being placed in HRP but later removed from HRP at Y-12 and receiving any psychological help (see Table 4) shows that 80 out of 6,754, or 1.2% of individuals who reported receiving psychological help during their HRP evaluations were placed into HRP and later removed compared to 170 of 13,542, or 1.3% of individuals who did not report receiving any psychological help during their HRP evaluations but were placed into HRP and later removed. No MMPI variables were significantly associated with being placed in HRP but later removed from HRP among individuals at Y-12.
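The bivariate comparison above can be reproduced directly from the reported counts. The short sketch below computes the two percentages and, as an added illustration not reported in the text, the unadjusted odds ratio implied by the same counts; the function and variable names are hypothetical.

```python
def percent(events, total):
    """Share of records with the outcome, expressed as a percentage."""
    return 100.0 * events / total


def odds(events, total):
    """Odds of the outcome: events divided by non-events."""
    return events / (total - events)


# Counts as reported in the text (Table 4, Y-12).
helped_removed, helped_total = 80, 6754
no_help_removed, no_help_total = 170, 13542

pct_helped = percent(helped_removed, helped_total)
pct_no_help = percent(no_help_removed, no_help_total)

# Unadjusted odds ratio (psychological help vs. no help); this ignores the
# other predictors and the repeated evaluations per individual.
unadjusted_or = odds(helped_removed, helped_total) / odds(no_help_removed, no_help_total)

print(f"{pct_helped:.1f}% vs {pct_no_help:.1f}%")
```

Because this calculation ignores the other predictors and the clustering of evaluations within individuals, it is a descriptive check on the direction of the association rather than a substitute for the model-based coefficient.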
Among individuals at Pantex, an increase by one in the number of adverse workplace events or the number of arrests or charges for an individual was associated with increased log-odds of being placed in HRP but later removed from HRP, after adjusting for the other predictors in the model. Ever taking medicine to help with psychological problems increased the log-odds of being placed in HRP but later removed from HRP, after adjusting for other predictors. As noted in the analytic approach section, MMPI variables were only assessed among individuals at Y-12.
Criterion Group C – Approved for HRP but later involved in one or more Incidents of Security Concern Investigations (IOSC)
Ever experiencing child abuse or neglect was found to increase the log-odds of having an IOSC for individuals at Y-12, adjusting for the other predictors in the model. Being on a rotating shift compared to not being on a rotating shift at Y-12 was associated with a decrease in the log-odds of having an Incident of Security Concern (IOSC), which was contrary to our expectations (see Table 5).
We reviewed the bivariate relationship between having an IOSC and being on a rotating shift (see Table 6) and found that 70 of 3,700 or 1.9% of individuals who were on a rotating shift had an IOSC compared to 460 of 16,591 or 2.8% of individuals who were not on a rotating shift and had an IOSC. No MMPI variables were significantly associated with having an incident of security concern among individuals at Y-12.
Among individuals at Pantex, having an anxiety or mood disorder, receiving any psychological help, and having higher perceived stress at work each increased the log-odds of an IOSC, adjusting for the other predictors in the model. An increase in the total number of adverse work events by one was also associated with an increase in the log-odds of having an IOSC, taking into account the other variables in the model.
Criterion Group D – Approved for HRP but later accused of an ethics violation resulting in an investigation
Overall, analyses did not find any variable measured to have a relationship with being accused of an ethics violation among individuals at Y-12, after controlling for all other variables in the model (see Table 7). There are no results for Pantex individuals as there were not enough occurrences of individuals being accused of an ethics violation to conduct meaningful analyses.
Criterion Group E – Approved for HRP but later determined to have violated workplace rules
At Y-12, an increase in the total number of adverse workplace events by one was associated with an increase in the log-odds of having a labor and/or discipline concern, controlling for other predictors. As well, ever using any hard drugs (e.g., cocaine, heroin) and ever taking any medicine for psychological needs were also associated with an increase in the log-odds of having a labor and/or discipline concern, after adjusting for other predictors in the model (see Table 8). No MMPI variables were significantly associated with having a labor and/or discipline concern among individuals at Y-12.
Among individuals at Pantex, an increase in the total number of adverse work events, the total number of major financial problems, or the number of arrests or charges by one was associated with an increase in the log-odds of having a labor and/or discipline concern after controlling for the other predictors.
Criterion Group F – Approved for HRP but later found to have experienced an Occupational Safety and Health Administration (OSHA) recordable injury or illness
An increase of one in the total number of major financial problems for individuals at Y-12 was found to be associated with higher log-odds of having an OSHA recordable event, controlling for other predictors (see Table 9). An increase of one in the antisocial behavior score on the Minnesota Multiphasic Personality Inventory was associated with an increase in the log-odds of having an OSHA recordable event among individuals at Y-12, after accounting for other predictors. There are no results for Pantex individuals as there were not enough occurrences of individuals experiencing an OSHA recordable injury or illness to conduct meaningful analyses.
Part of the purpose of conducting multilevel logistic regression modeling was to provide a basis for making predictions regarding the probability of future adverse workplace events. For each of the analyses reported previously, the multilevel models could be displayed as mathematical equations to predict the probability of future membership in each criterion group. For illustrative purposes, here we present equations representing the log-odds of having an OSHA event at Y-12 (omitting the random effects that were included in each of the models, meaning that predictions would correspond to an “average” individual with a random effect of zero (0)):
GENERAL EQUATION: logit(p) = ln(p/(1-p))= β0+β1 (Year (Centered at 2016))+β2 (Total Number of Adverse Work Events)+β3 (Total Major Financial Problems)+β4 (Problems with Peer or Supervisor)+β5 (Total Number of Alcohol Problems)+β6 (Total Number of Depression Symptoms)+β7 (MMPI Antisocial Score)+β8 (MMPI Cynicism Score)+β9 (MMPI Anxiety Score)+β10 (MMPI RE2R Score)
NUMERIC EQUATION: logit(p) = ln(p/(1-p)) = -7.81+(-0.22)(Year (Centered at 2016))+0.04(Total Number of Adverse Work Events)+0.58(Total Major Financial Problems)+(-0.13)(Problems with Peer or Supervisor)+(-0.19)(Total Number of Alcohol Problems)+0.09(Total Number of Depression Symptoms)+0.03(MMPI Antisocial Score)+(-0.01)(MMPI Cynicism Score)+0.03(MMPI Anxiety Score)+(-0.01)(MMPI RE2R Score)
For any individual, data derived from the annual HRP evaluation could be inserted into the equation above to compute the participant’s risk of having an OSHA event. These equations could be used to determine risk of future criterion group membership across all adverse work domains.
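As a minimal sketch of how the numeric equation might be applied, the Python below computes the log-odds and then converts them to a probability with the inverse logit, p = 1/(1 + exp(-logit)). The coefficients are copied from the numeric equation; the function names and dictionary keys are illustrative assumptions and do not correspond to field names in the PERIL master database. As in the equation, the random effects are omitted, so the result describes an “average” individual.

```python
import math

# Fixed-effect coefficients copied from the NUMERIC EQUATION (Y-12, OSHA event).
# The keys are illustrative names only, not fields from the PERIL database.
INTERCEPT = -7.81
COEFFICIENTS = {
    "year_centered_2016": -0.22,
    "adverse_work_events": 0.04,
    "major_financial_problems": 0.58,
    "peer_or_supervisor_problems": -0.13,
    "alcohol_problems": -0.19,
    "depression_symptoms": 0.09,
    "mmpi_antisocial": 0.03,
    "mmpi_cynicism": -0.01,
    "mmpi_anxiety": 0.03,
    "mmpi_re2r": -0.01,
}


def osha_event_logit(predictors):
    """Return the log-odds for one evaluation; omitted predictors count as 0."""
    unknown = set(predictors) - set(COEFFICIENTS)
    if unknown:
        raise KeyError(f"unknown predictors: {sorted(unknown)}")
    return INTERCEPT + sum(
        COEFFICIENTS[name] * value for name, value in predictors.items()
    )


def osha_event_probability(predictors):
    """Convert the log-odds to a probability with the inverse logit."""
    return 1.0 / (1.0 + math.exp(-osha_event_logit(predictors)))


def odds_ratio(name):
    """Multiplicative change in the odds for a one-unit increase in a predictor."""
    return math.exp(COEFFICIENTS[name])


if __name__ == "__main__":
    # Hypothetical values for a single evaluation, chosen only for illustration.
    example = {
        "year_centered_2016": 1,
        "major_financial_problems": 2,
        "mmpi_antisocial": 50,
    }
    print(f"log-odds:    {osha_event_logit(example):.2f}")
    print(f"probability: {osha_event_probability(example):.4f}")
```

Exponentiating a coefficient gives the corresponding odds ratio. For example, exp(0.58) is approximately 1.79, so each additional major financial problem multiplies the modeled odds of an OSHA event by roughly 1.79, holding the other predictors constant.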
LIMITATIONS
The research team set ambitious goals for this project by including refined data streams from relevant sources including Incidents of Security Concern (IOSC), Ethics and Employee Concerns (Ethics), and Labor Relations (Labor). While not included in our study proposal, we were also hopeful about including information from the Office of Personnel and Facility Clearances and Classification’s (OPFCC) personnel security file reviews. Lastly, we sought to include a criterion group representing HRP employees with extended records of exemplary service (i.e., employees who had no adverse safety or security-related event for the duration of their employment).
Although we were able to incorporate the refined data from IOSC, Ethics, and Labor, reviewing and codifying data from OPFCC’s personnel security file reviews was not feasible. This level of review might have allowed us to construct a criterion group representing HRP employees with extended records of exemplary service, which in turn would have allowed us to examine potential protective factors that reduce the risk of safety and security issues. However, doing so would have required a review of paper records. Given that the master database contained information from over 27,000 HRP psychological evaluations with hundreds of data points per evaluation, the resources required to construct the exemplary service criterion group exceeded the perceived benefit of doing so. Indeed, the challenge of constructing the master database was daunting. Acquiring, reviewing, cleaning, coding, de-identifying, exporting, and analyzing information gathered from 10 years of HRP psychological evaluations across two DOE/NNSA sites produced over 5,000,000 cells of data. This effort took hundreds of hours among five researchers.
We would be remiss if we did not mention that the period from which Pantex data were collected was a unique time in history. In March 2020, the World Health Organization (WHO) declared COVID-19 a global pandemic. The pandemic undoubtedly affected employees during the last two years of the period from which Pantex data were collected. From changes in the manner in which HRP psychological evaluations were conducted (virtual evaluations in separate on-site locations) to mandatory safety requirements (social distancing, mask requirements, vaccine mandates) and the uncertainty of a novel and deadly virus, Pantex employees were under uniquely stressful circumstances. It stands to reason that the stress of these circumstances had an impact on employee performance and PERIL criterion group behaviors. Without further analyses, the particular effects of these circumstances cannot be quantified.
We did not conduct comparisons between the control group and the criterion groups at each site on age, race/ethnicity, and gender. Future work should address this limitation and extend demographic comparisons to other domains, including education, socioeconomic status, relationship status, etc.
DISCUSSION
The Human Reliability Program (HRP) is a safety and security reliability program created to screen for elevated risk associated with nuclear weapons work. In this regard, it can serve to mitigate passive and active forms of insider threat behavior. Whereas HRP predates DOE’s Insider Threat Program (ITP), it clearly fits within this framework (Bawden, 2023).
HRP’s annual psychological evaluations are a key component of the HRP evaluative process. Clearly, diagnosable mental health conditions have the potential to compromise the safe and secure execution of work with nuclear material and explosives. Likewise, deficits in honesty, integrity, and personal responsibility could have devastating consequences. Security risk occurs in the context of accidental or purposeful compromise of classified or otherwise sensitive information and material. Information gleaned from the HRP psychological evaluation process is part of a larger process of mitigating risk from human factors. PERIL findings should inform the overall decision-making process for HRP certification rather than being the exclusive basis of decision making.
Because the HRP psychological evaluation is a resource-intensive requirement, the DOE/NNSA enterprise would be well served by assessing the validity and efficacy of this and other elements of the program. The HRP is unique among similar programs across the Intelligence Community (IC) in that it requires participants to submit to a psychological evaluation prior to approval for the program and annually thereafter. As such, the HRP psychological evaluation provides a rich source of information for selection and ongoing eligibility through a process of continuous monitoring. Mining these data with the purpose of building predictive models for insider threat-related behaviors could prove invaluable in preventing and/or mitigating such risks to national security.
The current project revealed that the HRP psychological evaluative process at two DOE/NNSA sites identified elevated safety and security risk and led to actions likely to have reduced that risk. There was a consistent relationship of HRP psychological evaluation elements, including the semi-structured interview (SIS) and psychodiagnostic testing (MMPI-2-RF), with indications of elevated risk and with future safety and security-related events (e.g., arrest, IOSC, mental health event). In particular, increases in the total number of adverse workplace events were predictive of membership in two different criterion groups - those who were removed from HRP due to emergent issues and those who were found to have violated workplace rules. These findings were consistent with hypothesized outcomes. Verifying the predictive power of these risk indicators allows psychologists to proceed confidently with their decisions regarding an individual’s suitability for certification in the HRP. At times, the identified risks result in disqualification from certification. At other times, identified risks can be mitigated through various interventions. For example, acute yet treatable behavioral health issues can be resolved through referrals to an Employee Assistance Program (EAP) or other behavioral health services. Likewise, temporary financial problems can be resolved through a referral to financial counseling services or other community resources. Screening out those not qualified for HRP certification and rehabilitating otherwise qualified HRP participants are expressed goals of the regulation.
To some it may seem surprising that the SIS, in most regards, outperformed a well-validated psychodiagnostic instrument like the MMPI-2-RF in predicting adverse outcomes. There are likely two factors at play. First, due to statistical considerations noted above, we were able to use relatively few of the MMPI scores in our analyses. Second, the SIS was designed to identify an individual’s history of behavior reflecting problematic judgment and reliability. Although the MMPI-2-RF assesses for factors that likely predispose individuals in this regard, the SIS assesses actual past behavior.
The PERIL project provides strong support for the validity and efficacy of the HRP psychological evaluation process as executed at two DOE/NNSA sites – Y-12 and Pantex. While there are differences in the evaluative processes across these sites, there are similarities in the overall approach and in the outcomes. The procedural differences may account for some of the variability in predictors associated with similar risk groups at each site. In the end, most of the elements of the HRP evaluation were associated with risk-related criterion groups at both sites. This serves as evidence that the HRP, as implemented at Y-12 and Pantex, identifies and mitigates safety and security risk from passive and active insider threat associated with work in the nuclear weapons enterprise.
FUTURE RESEARCH
This study provided a rich source of data regarding the efficacy and validity of the HRP evaluative process as implemented at the two participating sites. As a result, the PERIL findings can inform the development of HRP best practices across the DOE/NNSA enterprise. In particular, the SIS was an invaluable source of data and consistently served as the strongest predictor of criterion group membership. The PERIL research team strongly recommends the use of the SIS (or a very similar type of semi-structured interview) for all HRP psychological evaluations. The electronic medical record system, EMBOS, is currently in use at six DOE/NNSA sites. The SIS is part of the palette of services within EMBOS. As such, it seems logical that the SIS would be the standard tool for the semi-structured interview at least across these sites. In fact, developing a consensus about HRP best practices across DOE/NNSA nuclear weapons sites would improve standardization of HRP’s key process elements. There is a reciprocal relationship between the development of best practices and the analysis of data from the field.
Despite the magnitude of the association between the SIS data and criterion group membership, there are opportunities to improve this tool. For example, because time is a mitigating factor for many risk factors (e.g., a DUI arrest in 1984 seems less pertinent than a DUI arrest in the year prior to an HRP evaluation), PERIL researchers recommend the inclusion of time frames for adverse events (e.g., more than 10 years ago, 5 to 10 years ago, 1 to 5 years ago, within the last year). It would also be useful to convene a team of HRP psychologists, statisticians, and information technology specialists to perform an in-depth review of the SIS and recommend modifications that would maximize its clinical utility as a valid measure of safety and security risk. Another goal of the endeavor would be to improve the functionality of the SIS as a means of capturing information for export into a database designed to build predictive models. Those models would, in turn, better inform the psychologists and better equip them to make decisions regarding HRP certification. Ultimately, this approach would lead to better detection and mitigation of a variety of risks, including counterproductive workplace behavior and insider threat activities.
The HRP psychological evaluation addresses low-incidence, high-consequence behaviors and events. As such, the criterion groups at both Y-12 and Pantex were relatively small. Given the statistical modeling employed in this study, the number of predictor variables included in the analyses was limited by the size of the criterion groups. Thus, there were a number of evaluation elements that could not be included as potential predictor variables. This underlines the need for broader participation across the DOE/NNSA enterprise. In addition, data were collected at the Pantex site during a global pandemic. It is difficult to determine what, if any, impact this may have had on the differential findings between sites, but having additional data from multiple sites collected during the same time period would improve the comparability and generalizability of the findings.
At any given time, there are approximately 10,000 HRP-certified individuals. The PERIL study included participants across two of the DOE/NNSA sites. The participation of additional sites would significantly expand the dataset thereby facilitating the inclusion of additional predictor variables and the development of more powerful predictive models to identify and deter risk. In fact, the equations provided above (for calculating risk of an OSHA event) could be applied across multiple sites to determine risk of future criterion group membership across the full range of adverse safety and security-related events. By strengthening and standardizing the content and administration of the SIS across its sites and by increasing the number of participating sites, DOE/NNSA could dramatically expand the dataset, significantly increase the number of criterion group members, and maximize the power to predict these adverse safety and security-related events. The PERIL research team strongly encourages expanding HRP research endeavors to all DOE/NNSA sites. This would allow for a research-based approach, using predictive analytics, to better identify and mitigate safety and security risk.
DISCLAIMER
This work of authorship and those incorporated herein were prepared by Consolidated Nuclear Security, LLC (CNS) as accounts of work sponsored by an agency of the United States Government under Contract DE‑NA‑0001942. Neither the United States Government nor any agency thereof, nor CNS, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility to any non-governmental recipient hereof for the accuracy, completeness, use made, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency or contractor thereof, or by CNS. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency or contractor (other than the authors) thereof.
COPYRIGHT NOTICE
This document has been authored by Consolidated Nuclear Security, LLC, under Contract DE‑NA‑0001942 with the U.S. Department of Energy/National Nuclear Security Administration, or a subcontractor thereof. The United States Government retains and the publisher, by accepting the document for publication, acknowledges that the United States Government retains a nonexclusive, paid‑up, irrevocable, world‑wide license to publish or reproduce the published form of this document, prepare derivative works, distribute copies to the public, and perform publicly and display publicly, or allow others to do so, for United States Government purposes.