1. Introduction
Morgan Stanley is a leading global financial services firm with offices in 42 countries and more than 80,000 employees. The nature of the business requires the insider threat team to regularly assess the insider threat program and report its status to multiple stakeholders: internally to upper management, business units, and auditors, and externally to groups such as banking industry regulators. These parties can possess different levels of knowledge regarding insider threats and may require information to be presented at different levels of technical detail. These requirements are not unique to the firm or to the financial services industry, as any large organization operating in a regulated industry will likely face similar challenges.
A maturity model is a framework for measuring an organization’s capabilities with regard to a specific discipline (Wendler, 2012). The basic concept is that organizational attributes develop through a number of logical stages (or levels) from an initial level to a more mature level (Gottschalk, 2008). Organizations can therefore use a maturity model to evaluate their capabilities against an external standard, and to obtain guidance regarding how they might improve. As such, a maturity model is an appropriate tool for the task of assessing and communicating the status of an insider threat program.
A number of maturity models have been published for use in areas such as process improvement and software engineering. The basic components of a maturity model are a set of focus areas, a set of maturity levels, and descriptions of the capabilities of the organization at each level for each focus area. Commonly, maturity models define between three and six levels. For example, the Capability Maturity Model (CMM) focuses on software development practices, and defines five levels: ‘initial,’ ‘repeatable,’ ‘defined,’ ‘managed,’ and ‘optimizing’ (Paulk et al., 1993).
In 2018, the National Insider Threat Task Force (NITTF) published their Insider Threat Program Maturity Framework (NITTF, 2018b). That document describes nineteen “maturity elements” (meaning capabilities or attributes) of insider threat programs. Those elements are organized into seven topic areas such as “program personnel” and “access to information.” However, the NITTF document does not define levels as in a traditional maturity model such as the CMM. Rather, the FAQ for the NITTF document describes the users of the model as being able to “choose among the maturity elements for those that best fit with their workplace environment, technology infrastructure, and mission” (NITTF, 2018a). In 2021, the Cybersecurity and Infrastructure Security Agency (CISA) in collaboration with Carnegie Mellon University’s Software Engineering Institute released an Insider Risk Mitigation Program Evaluation (IRMPE) tool to assist organizations with evaluating the maturity of their insider threat programs. That tool uses a fillable PDF that generates a report, and is intended primarily for “small and mid-sized [organizations] that may not have in-house security departments” (CISA, 2022).
While these existing models offer valuable insights and tools for organizations, a fundamental issue with using any single maturity model is that the content of that model inevitably reflects the views and preferences of the authors. The possibility also exists that the content might contain biases, omissions, or even errors. Therefore, the decision was made to design a maturity model that consolidates the recommendations from multiple best practice documents.
2. Design
Design science is an appropriate research paradigm when the goal is to create a specific artifact such as a maturity model in a real-world setting (Gregor & Hevner, 2013). Table 1 lists seven guidelines for design science research as specified by Hevner et al. (2004). These guidelines have been used by other researchers when developing maturity models (Wendler, 2012).
Of the seven guidelines, DS1 and DS2 focus on preparatory activities before the design phase; DS3 and DS4 are centered on evaluating the outcomes post-design; DS5 and DS6 pertain to the design phase, guiding the development and refinement of the model; and DS7 emphasizes the importance of communication after the design is completed. DS5 and DS6 are therefore the two guidelines specifically focused on the topic of design. DS6 refers to design as being a “search process,” with the goal of discovering an effective solution to the underlying problem. This requires knowledge of the requirements of the organization, which are described in section 1. DS5 refers to the need for “research rigor,” meaning the research should be conducted using appropriate theoretical foundations and research methodologies.
The research methodology selected for this work was the Design Science Research Methodology by Peffers et al. (2007). As shown in Figure 1, Peffers et al. provide a process model with six activities and four possible entry points. The six activities are intended to be carried out in nominally sequential order. Because an organizational need and specific requirements had already been established, and because design had not yet begun, the process was entered at the ‘Design & Development’ entry point, leading to activity 3.
The first step in delivering activity 3 was to select the best practice documents that would be used to populate the model. A number of such documents exist. For this exercise, the eight documents listed in Table 2 were selected. These documents were chosen because the reputable nature of each source was likely to produce content of an acceptable quality, because each source was unique, and because time and resource constraints limited the selection to approximately this many documents. The documents from FINRA and SIFMA are specific to the financial services sector, which was appropriate for Morgan Stanley. Other organizations might choose to select different source documents that are specific to their sector.
The eight source documents were reviewed to identify all recommendations, meaning all declarative statements regarding courses of action that organizations should take with regard to insider threat programs. A total of 568 recommendations were identified and placed into a spreadsheet. Open coding was then performed, in the style of Grounded Theory (Glaser & Strauss, 1967). Coding of this type is commonly used in the analysis of qualitative data for security research (e.g., Alsowail & Al-Shehari, 2022; Krombholz et al., 2017). This exercise resulted in the creation of 20 categories relating to insider threat programs.
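As a rough illustration of how the coded data might be structured (the actual exercise used a spreadsheet; the field names and values below are hypothetical), consider the following minimal Python sketch:

```python
from dataclasses import dataclass, field

@dataclass
class Recommendation:
    """One declarative statement extracted from a source document."""
    source: str  # e.g., "NITTF" (illustrative value)
    text: str    # the recommendation as written
    codes: list[str] = field(default_factory=list)  # open codes assigned by a reviewer

def tally_codes(recs: list[Recommendation]) -> dict[str, int]:
    """Count how often each open code was assigned across all recommendations."""
    counts: dict[str, int] = {}
    for rec in recs:
        for code in rec.codes:
            counts[code] = counts.get(code, 0) + 1
    return counts
```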
The categories were then consolidated where appropriate in order to reduce the overall number. This was accomplished by bundling conceptually related categories together, such as by bundling the ‘management,’ ‘governance,’ and ‘metrics’ categories together, and by bundling the ‘off-boarding’ and ‘terminations’ categories together (a sketch of this consolidation step appears after the list below). The resulting list of 13 categories was as follows:
- Asset Management
- Backups
- Cloud Security, Network Security, and DLP (Data Loss Prevention)
- End-User Reporting
- Management, Governance, and Metrics
- Identity & Access Management
- Incident Response
- Monitoring
- Off-boarding and Terminations
- Risk Management
- Threat Intelligence and Information Sharing
- Training & Awareness, and EAPs (Employee Assistance Programs)
- Vetting & Onboarding
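The consolidation step can be pictured as a simple mapping from original categories to bundled ones. In the minimal sketch below, the entries shown reflect the two bundling examples given above; a complete mapping would cover all 20 original categories:

```python
# Map each open-coded category to its consolidated category. Entries shown
# reflect the bundling examples described above; others would be added to
# cover all 20 original categories.
CONSOLIDATION = {
    "management": "Management, Governance, and Metrics",
    "governance": "Management, Governance, and Metrics",
    "metrics": "Management, Governance, and Metrics",
    "off-boarding": "Off-boarding and Terminations",
    "terminations": "Off-boarding and Terminations",
}

def consolidate(category: str) -> str:
    """Return the consolidated category, or the original if unbundled."""
    return CONSOLIDATION.get(category, category)
```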
Given the above 13 categories, the next step was to select the maturity levels into which the 568 recommendations from the 8 source documents would be placed. Because of the relevance to the financial services industry in which Morgan Stanley operates, the maturity levels defined in the Cybersecurity Assessment Tool created by the Federal Financial Institutions Examination Council (FFIEC) were selected (FFIEC, 2017). The FFIEC Cybersecurity Assessment Tool defines five levels: ‘baseline,’ ‘evolving,’ ‘intermediate,’ ‘advanced,’ and ‘innovative.’ The description of each level is summarized in Table 3.
As the final step in the creation of the maturity model, the 568 recommendations in the 13 categories were evaluated against the 5 FFIEC maturity levels and placed into tables in a document, with one table per category. Table 4 shows an example table from the model for the vetting & onboarding category, and Table 5 shows an example table from the model for the monitoring category.
This evaluation process was carried out by multiple individuals to reduce the possibility of individual bias. Where duplicate recommendations existed because of the use of multiple source documents, a single recommendation that best captured the spirit of the collective guidance was selected. As a result, the number of recommendations was reduced from 568 to 127. It was noted that as the recommendations from each source document were processed, the number of novel recommendations presented by each additional source document declined substantially. As such, it appeared that the point of saturation had been reached, meaning that few or no new recommendations would be forthcoming if additional source documents were added (Saunders et al., 2018).
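The saturation observation can be illustrated with a short sketch that counts how many previously unseen recommendations each additional source document contributes. For simplicity, the sketch treats recommendations as already-normalized strings, whereas the actual deduplication judged semantic equivalence:

```python
def novelty_curve(docs: list[list[str]]) -> list[int]:
    """For each document (in processing order), count recommendations
    not already contributed by an earlier document."""
    seen: set[str] = set()
    counts: list[int] = []
    for recs in docs:
        novel = {r for r in recs if r not in seen}
        counts.append(len(novel))
        seen |= novel
    return counts

# A steadily declining curve (hypothetically, [210, 95, 40, ...])
# suggests the point of saturation is being approached.
```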
The final document containing the completed maturity model is twelve pages long, which is a tractable length and considerably shorter and thus more manageable than many of the source documents. A visual representation of the end-to-end process used to design the maturity model is shown in Figure 2.
Proceeding with the subsequent activities in the Design Science Research Methodology, the description of activity 4 (“Demonstration”) is provided in section 3. The description of activity 5 (“Evaluation”) is provided in sections 4 and 5. Activity 6 (“Communication”) is delivered by this article as a whole.
3. Implementation
Distinct from the creation of the model itself is the assessment of the insider threat program within Morgan Stanley against the model. A streamlined approach with two phases was adopted for the assessment, with the goal of reducing the resource demand on both the insider threat team and the broader organization. This lightweight approach could potentially be expanded in the future, such as by acquiring material evidence of the status of each recommendation.
In the first phase, the insider threat team reviewed each recommendation in the model and assigned a score (or label) of ‘high,’ ‘medium,’ or ‘low’ to depict the organization’s current state. A high score represented close parity between the description in the recommendation and the current state within the organization. For example, if a recommendation said to “regularly test backup and recovery processes,” and the backup and recovery processes inside the organization were indeed tested regularly, then the item would receive a ‘high’ score. Similarly, if the backup and recovery processes were tested but only infrequently, the item might receive a ‘medium’ score. A ‘low’ score would be assigned if no regular testing took place. This type of scoring is subjective, but was rooted in the knowledge and experience that the insider threat team held collectively regarding the organization. When performing this scoring, each recommendation was evaluated according to its spirit rather than its letter. For example, the recommendation in the ‘baseline’ level of the vetting & onboarding category in Table 4 refers specifically to EU GDPR (General Data Protection Regulation) laws, but the spirit of that recommendation was taken to mean compliance with all applicable laws.
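A minimal sketch of this scoring scheme follows, using the backup-testing example above; the enum and the keying structure are illustrative rather than a description of the team’s actual tooling:

```python
from enum import Enum

class Score(Enum):
    HIGH = "high"      # close parity between recommendation and current state
    MEDIUM = "medium"  # partial implementation, e.g., testing performed only infrequently
    LOW = "low"        # little or no implementation

# First-phase scores keyed by (category, level, recommendation);
# the sample entry is illustrative.
scores = {
    ("Backups", "baseline", "Regularly test backup and recovery processes"): Score.HIGH,
}
```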
In the second phase, subject matter experts (SMEs) within Morgan Stanley with specific knowledge of each category in the model reviewed the scores assigned in the first phase. Depending on the availability of each SME, in some cases the scores were gathered in an interview setting with a member of the insider threat team present; otherwise, the scores were provided by the SME via email. Ten scores (7.9%) were changed as a result of the review by the SMEs in the second phase.
A spreadsheet marking each of the 127 recommendations as high, medium, or low cannot easily convey the big picture of the maturity level of an insider threat program. In order to deliver that view, the visualization in Figure 3 was devised. In this visualization, each colored box is a recommendation within the maturity model, with the color representing the score. The colored boxes are organized by row (the category) and by column (the maturity level). The categories and levels often contain different quantities of recommendations because the underlying best practice documents do not offer the same amount of guidance on every topic, nor at every maturity level.
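A minimal matplotlib sketch of this style of visualization follows, assuming the scores are held in a nested dictionary keyed by category and then by maturity level. The color choices and the fixed horizontal slot per level are illustrative assumptions; the actual figure was produced separately:

```python
import matplotlib.pyplot as plt
import matplotlib.patches as patches

# scores[category][level] is a list of "high"/"medium"/"low" labels,
# one per recommendation (simulated data, as in Figure 3).
COLORS = {"high": "#2e7d32", "medium": "#f9a825", "low": "#c62828"}
SLOT = 8.0  # horizontal space reserved per maturity level (assumes <= 8 boxes per cell)

def draw_model(scores: dict[str, dict[str, list[str]]], levels: list[str]) -> None:
    fig, ax = plt.subplots()
    categories = list(scores)
    for row, cat in enumerate(categories):
        for col, level in enumerate(levels):
            for i, label in enumerate(scores[cat].get(level, [])):
                ax.add_patch(patches.Rectangle((col * SLOT + i, row), 0.9, 0.9,
                                               color=COLORS[label]))
    ax.set_xticks([c * SLOT + SLOT / 2 for c in range(len(levels))])
    ax.set_xticklabels(levels)
    ax.set_yticks([r + 0.45 for r in range(len(categories))])
    ax.set_yticklabels(categories)
    ax.set_xlim(0, len(levels) * SLOT)
    ax.set_ylim(0, len(categories))
    ax.invert_yaxis()  # keep the first category at the top
    plt.show()
```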
Figure 3 contains simulated data, but looking at the visualization immediately conveys the sense that the organization’s insider threat program operates at the upper-intermediate level overall. In terms of the costs versus the benefits received from spending on security controls, an organization would likely prefer to be assessed at the upper-intermediate or lower-advanced level. If the assessed maturity level were lower, this would mean that the organization had not yet implemented security controls and practices that have proven to be effective (hence their placement in the lower levels of the maturity model). But if the assessed maturity level were higher, this would mean that the organization was an early adopter of emerging technologies and practices whose cost-benefit has not yet been proven. An organization can therefore use the results of the assessment to calibrate its program.
With regard to the level achieved within each individual category, there are a number of possible interpretations. A strict interpretation might select the highest level at which all scores at that level and below are scored ‘high.’ A more permissive interpretation might select the level at which the inflection point is most pronounced.
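The strict interpretation amounts to a short procedure. A minimal sketch follows, assuming the scores for a single category are held in a dictionary keyed by level, with levels ordered from ‘baseline’ to ‘innovative’ (all names are illustrative):

```python
def strict_level(scores_by_level: dict[str, list[str]], levels: list[str]) -> str | None:
    """Return the highest level at which all scores at that level and below
    are 'high', or None if the lowest level already falls short.
    Levels with no recommendations are treated as satisfied."""
    achieved = None
    for level in levels:
        if all(s == "high" for s in scores_by_level.get(level, [])):
            achieved = level
        else:
            break
    return achieved
```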
4. Validation
The visualization in Figure 3 was presented to several stakeholders within Morgan Stanley. Positive feedback was received regarding the ability of the visualization to communicate both the maturity model and the assessment in a manner that was accessible to both technical and non-technical audiences. The visualization also facilitated detailed discussions, as a separate key was used to explore specific recommendations in depth. While the subjective satisfaction of stakeholders is a qualitative measure of validation, the overall approach was judged to have successfully met the requirements described in the introduction: namely, to create a framework that enables the insider threat team to regularly assess the insider threat program, and to report that status to various stakeholders who possess different levels of technical knowledge regarding insider threats.
A second way in which the assessment against the model has proven to be useful is for planning. Recommendations scored low or medium within the model are obvious candidates for future investments. However, an insider threat team within a large organization is unlikely to have direct functional responsibility for all the categories in the model. For example, backups, asset management, and identity & access management are all likely to be distinct and separate functions within a large organization.
To address this aspect, the 127 recommendations in the model were systematically classified using a Responsibility Assignment Matrix (or RACI) (Project Management Institute, 2013, p. 262). A total of 66 recommendations were identified across 8 categories for which the insider threat team was either responsible (R) or accountable (A). The 39 recommendations scored ‘high’ were then removed, since those recommendations represent less opportunity for improvement. The remaining 27 items were then sorted into three buckets (this filtering is sketched below).
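The filtering described above reduces to two simple passes over the data. A minimal sketch, with hypothetical field names and sample entries:

```python
# Each item carries the insider threat team's RACI role and its assessed
# score; the field names and sample entries are hypothetical.
items = [
    {"text": "Example recommendation", "raci": "R", "score": "medium"},
    {"text": "Another recommendation", "raci": "C", "score": "low"},
]

# Keep items the team is Responsible or Accountable for (127 -> 66 in the
# actual exercise), then drop items already scored 'high' (66 -> 27).
owned = [i for i in items if i["raci"] in ("R", "A")]
candidates = [i for i in owned if i["score"] != "high"]
```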
The first bucket contains recommendations where the recommendation is currently not implemented, and there are no plans to do so. For example, for legal and compliance reasons an organization might choose not to follow the recommendation to monitor the internet footprint of workers. The contents of this bucket are periodically revisited in light of changing circumstances. The second bucket contains recommendations that are actively being worked on. For example, an organization might be in the process of working to improve security monitoring capabilities. The contents of this bucket are reviewed on an ongoing basis to ensure that forward progress is being made. The third bucket contains recommendations that are novel, meaning work items that could potentially be undertaken. For example, an organization might have implemented personnel screening but not yet implemented continuous vetting. The contents of this bucket are reviewed as possible work items when carrying out future planning. Table 6 provides an example of the output of this planning process with simulated data.
5. Limitations
A fundamental limitation is that useful recommendations might exist outside of the best practice documents that were used to populate the maturity model. Best practices are conventional wisdom, by definition. There might therefore be useful guidance – perhaps relating to a fast-moving topic such as technology – that has not yet been recognized and codified as a best practice. In that sense, a maturity model based on current best practices might equip an organization to “fight the last war.” The faster the rate of innovation in the field of insider threats generally, the more pronounced this limitation becomes.
With regard to the assessment against the maturity model, in retrospect it would have been preferable to have the SMEs create their own scores prior to being provided with the initial scores generated by the insider threat team. This would reduce the possibility of anchoring bias. A related matter is that the scoring is entirely subjective from top to bottom, although the use of multiple knowledgeable parties to perform the scoring should reduce response errors. Triangulation (the use of multiple research methods to study the same phenomenon) could be improved by using cognitive interviews to gather all SME scores, in which the interviewee describes their engagement with each question.
Lastly, creating a maturity model using best practice documents and then performing an assessment against that maturity model means that the assessment is in effect being performed against the best practice documents. It is possible that the recommendations within the best practice documents simply do not fit the requirements of a particular organization. Reviewing the original 568 recommendations reveals that topics such as personnel vetting, security monitoring, and training & awareness received more attention from the authors of the source documents (meaning more content and more detailed content) when compared to topics such as backups and network security. This might partially reflect an unconscious bias towards topics that are traditionally the responsibility of insider threat teams. It does not follow that the topics favored by the authors of the best practice documents will necessarily provide the most benefit to an organization when compared to alternatives. Best practices should not be assumed to be “one size fits all,” and each organization should customize its approach. To address this aspect, an assessment against a maturity model would ideally be paired with long-term data regarding operational outcomes, such as the frequency and severity of insider incidents; this would be a useful avenue for future research.
6. Conclusion
The insider problem has been described as “not one problem, but many” (Hunker & Probst, 2011). This aspect emerges in the recommendations provided by best practice documents. Those recommendations span a wide variety of topic areas including management, governance, metrics, network security, and backups, amongst others. Because of the quantity and breadth of these recommendations, and because there is no single definitive best practices document, organizations can face challenges with the task of assessing their insider threat programs. A maturity model is a useful approach for addressing these issues, because the format enables the recommendations from multiple best practice documents to be combined and reduced to a manageable size.
This article has demonstrated that the design and implementation of such a maturity model can be accomplished in a relatively straightforward manner by employing a structured methodology, and without requiring substantial resources. The specific model created by Morgan Stanley is one example of a model created by following that methodology. Other organizations can repeat the process to create models that best fit their needs – such as by selecting source documents that are appropriate to their specific sector. Future research could examine whether a general model can fit organizations in different sectors, or fit different kinds of organization such as government agencies and non-governmental organizations.
A visualization that summarizes the results of an assessment against the maturity model was also presented. In an organizational setting, the manner in which information is communicated can be as important as the information itself. At Morgan Stanley, the visualization has been found to be effective at conveying the results of an assessment to a wide range of audiences. Lastly, a maturity model can be used as an input to planning activities. An organization might set specific goals for the overall maturity level, or for the assessed levels within specific categories.
Maturity models are successfully employed in a variety of fields, including software development, process improvement, and project management. This article is a contribution towards the more prevalent use and study of maturity models for the insider problem.