Data Viewer Introduction and Guide to Use

Initial version, Sept. 2024

Table of Contents:

  • Data Viewer Overview
  • Understanding Key Aspects of the Data Viewer
  • Useful Tools and Features in the Data Viewer
  • Appendix I: Full Feature List and Notes
  • Appendix II: Region Definitions


Data Viewer Overview

These tools complement the CaRCC Capabilities Model with interactive data visualizations to explore the Community Dataset provided by institutions that are using and/or have completed assessments using the Model. The Model was developed to identify the variety of relevant approaches taken to support Research Computing and Data (RCD) and the key factors for providing that support. The Model is designed for use as an input to strategic decision making and is intended to be inclusive across a broad range of institutions.

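The Data Viewer includes three types of views of the Community Dataset:

  1. Community Demographics
  2. Capabilities Data and Benchmarking
  3. Priorities Data (coming in a future release)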

Who can use the Data Viewer?

This tool is for anyone who wants to understand the state of Research Computing and Data support across the community and explore aspects of that by topic, region, types of institutions, etc.

The Data Viewer is designed to be useful to a diverse mix of stakeholders, including campus research computing and data practitioners, the researchers and educators they work with, as well as key partners (e.g., central IT), and campus leadership. Users who are unfamiliar with the CaRCC Capabilities Model may wish to review the associated Introduction and Guide to Use.

What are common uses for the Data Viewer?

Acknowledgements

The RCD Capabilities Model was developed through a collaboration of the Campus Research Computing Consortium (CaRCC), Internet2, and EDUCAUSE. This work has been supported in part by the National Science Foundation via OAC-1620695 and OAC-2100003.

Understanding Key Aspects of the Data Viewer

The data includes self-reported capabilities across a range of different topics presented in the Model[1]. In addition, we have metadata about the institutions themselves that is used both to visualize the community of users, and to support filtering of the Community Dataset.

The key concepts that structure the data are:

  1. Institutional Metadata
  2. Institutional Populations
  3. The Five Facings
  4. Local Priority

Institutional Metadata

We have metadata about the institutions that are using the Model that we draw from the Department of Education’s IPEDS data[2], from the Carnegie Classification[3], and from data that users provide when they create a profile to use the Model. The metadata values are used to provide different comparative views of the data and to filter the data to understand or benchmark a particular subset of institutions.

Institutional Classification: Values are based upon the widely used Carnegie Classification with adjustments and extensions to support our community of users (e.g., to include additional types of institutions and those that are not U.S. Higher Education institutions).

Institutional Mission: This is a simpler classification that allows for grouping and comparison of institutions by their core mission (the balance of research and teaching)[4]. Note that the dataset does not include this value for all institutions; however, we are working to expand coverage to as many as possible. Values include:

Research Essential: Research is the primary or exclusive mission, and teaching does not significantly factor into faculty and institutional success (Research Labs, National Supercomputing Centers, etc.).

Research Favored: Research and teaching are the primary missions, but research is what really drives faculty and institutional success (e.g., Research-driven Universities).

Balanced: Research and teaching are both primary missions, and they are equally important for faculty and institutional success.

Teaching Favored: Teaching is the primary mission, but faculty research is rewarded.

Teaching Essential: Teaching is the primary mission, and faculty research does not factor heavily in faculty and institutional success.

Public/Private: (a.k.a. “Control” of the institution). Note that we can accommodate private for-profit institutions, but few such institutions are participating to date.

EPSCoR Status: for higher education institutions in EPSCoR jurisdictions, as defined by the National Science Foundation (NSF)[5].

Minority-Serving Status: drawn from the US Department of Education Eligibility Matrix[6].

Region: The geographic region in which the institution is located. For a list of the regions and which states/provinces each includes, see Appendix II.

Size: The total number of students (graduate and undergraduate) as reported in the IPEDS dataset. Values are ranges (also defined by IPEDS).

Research Expenditures: The total US dollar value of research expenditures as reported by IPEDS; the most recent dataset we are using (as of 2024) includes values reported in the IPEDS survey for 2022. Caveat Lector: these reported values may be out of date and/or may differ from what institutions currently report (e.g., on their website).

Organizational Model: data provided by users when they create a Profile. Users choose the option that best describes how their institution's RCD services and staff are organized. Values include:

Centralized: Primarily within a central RCD/HPC group.

In a School/Dept.: Embedded within a single department or school.

Decentralized across units: Decentralized collaboration among several departments, schools, etc.

No organized support: No organized RCD support program currently exists.

Reporting Structure: data provided by users when they create a Profile. Users choose the option that best describes where within the institution their RCD program ultimately reports. Values include:

Information Technology, e.g., the Chief Information Officer (CIO)

Research, e.g., a Vice Provost for Research (VPR)

Academic Leadership (e.g., the Provost or a Dean)

Academic/Research Institute or Center

Other (not otherwise specified)

Institutional Populations

There are two populations represented in the data view graphs:

  1. All Users: This includes all institutions that have created a profile on the RCD Nexus portal, and/or that requested and downloaded an earlier version of the CaRCC Capabilities Model.
  2. Contributors: This includes only those institutions that have completed an assessment using the CaRCC Capabilities Model, and agreed to contribute their data to the Community Dataset.
    • Users affiliated with an institution that is a current Contributor have access to benchmarking functionality as well as some additional, more detailed data views.
    • Through the end of 2024, all institutions that contributed an assessment to the Community Dataset are considered current.
    • Beginning in 2025, only institutions that contributed an assessment within the previous three (3) years will be considered current.
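
The rule for which Contributors count as current can be sketched as a small date check. This is an illustration only: the function name, the use of calendar dates, and the reading of "within the previous three years" as a rolling three-year window ending on the evaluation date are assumptions, not a description of how the portal implements the rule.

```python
from datetime import date


def is_current_contributor(last_assessment: date, as_of: date) -> bool:
    """Return True if an institution counts as a current Contributor on `as_of`.

    Assumption: `last_assessment` is the date of the most recent contributed
    assessment, and the three-year window is measured back from `as_of`.
    """
    # Through the end of 2024, every contributing institution is current.
    if as_of.year <= 2024:
        return True
    # From 2025 on, the assessment must fall within the previous three years.
    try:
        cutoff = as_of.replace(year=as_of.year - 3)
    except ValueError:
        # as_of is Feb 29 and the year three years earlier is not a leap year.
        cutoff = as_of.replace(year=as_of.year - 3, day=28)
    return last_assessment >= cutoff
```

For example, under these assumptions an assessment contributed in May 2021 would count as current on 31 December 2024 but not on 1 January 2025.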

For the Community Demographics visualizations, users can select the population they are interested in using the Population filter choices. The Capabilities Data and Benchmarking visualizations and the Priorities Data visualizations are necessarily drawn only from Contributors.

Note that for institutions that have completed and contributed more than one assessment (i.e., who have repeated the process in more than one year), the Community Dataset only uses the most recent assessment data for that institution.
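
Selecting only the most recent assessment per institution can be sketched as follows. The record layout (the `institution`, `year`, and `coverage` keys) is hypothetical and chosen only for illustration; the Community Dataset's actual schema is not described in this guide.

```python
def latest_assessments(assessments):
    """Keep one record per institution: the one with the latest year.

    `assessments` is an iterable of dicts with hypothetical keys
    "institution" and "year" (plus any other fields).
    """
    latest = {}
    for record in assessments:
        name = record["institution"]
        # Replace the stored record only if this one is more recent.
        if name not in latest or record["year"] > latest[name]["year"]:
            latest[name] = record
    return latest


# Hypothetical sample data: institution A contributed twice.
sample = [
    {"institution": "A", "year": 2021, "coverage": 0.4},
    {"institution": "A", "year": 2023, "coverage": 0.6},
    {"institution": "B", "year": 2022, "coverage": 0.5},
]
most_recent = latest_assessments(sample)
```

Here `most_recent["A"]` is the 2023 record, so the earlier 2021 assessment does not contribute to the community view.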

The Five Facings

The Model is organized into sections that reflect different roles that staff fill in supporting Research Computing and Data, and are named to reflect who or what each role is "facing" (i.e., focused on).[7] Within each facing, the model poses questions about aspects of research computing and data for the associated role; the questions are grouped into Topics.

Larger organizations may have a team associated with each facing role, while smaller organizations may have just a few people who cover these different roles. In filling out the assessment tool, you will likely want to involve people who work in the different roles; they can work to fill out their respective section of the assessment.

Facing Area: Description

Researcher-Facing Topics: Includes research computing and data staffing, outreach, and advanced support, as well as support in the management of the research lifecycle.

Data-Facing Topics: Includes data creation; data discovery and collection; data analysis and visualization; research data curation, storage, backup, and transfer; and research data policy compliance.

Software-Facing Topics: Includes software package management, research software development, research software optimization or troubleshooting, workflow engineering, containers and cloud computing, securing access to software, and software associated with physical specimens.

Systems-Facing Topics: Includes infrastructure systems, systems operations, and systems security and compliance.

Strategy and Policy-Facing Topics: Includes institutional alignment, culture for research support, funding, and partnerships and engagement with external communities.

Table 1 - Description and examples for the Five Facings

Local Priority

Users of the assessment tool can optionally mark each capability as a priority for their organization. This can be used, e.g., to mark items they want to address as part of strategic planning. Priority values range from 1 to 99 (where 1 is the top priority).

These Local Priority values have no impact on the calculated coverage; however, they may indicate where RCD support organizations see challenges or opportunities, and they provide an additional lens on the state of RCD support across the community.
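
As a minimal sketch of how Local Priority values behave, the snippet below validates the 1 to 99 range and lists marked capabilities from the top priority downward. The function names and the capability names in the usage note are hypothetical, and treating an unmarked capability as `None` is an assumption.

```python
from typing import Dict, List, Optional


def validate_priority(value: Optional[int]) -> Optional[int]:
    """Accept None (capability not marked) or an integer from 1 to 99."""
    if value is None:
        return None
    if not 1 <= value <= 99:
        raise ValueError(f"priority must be between 1 and 99, got {value}")
    return value


def rank_by_priority(priorities: Dict[str, Optional[int]]) -> List[str]:
    """Return marked capabilities ordered from top priority (1) downward.

    Unmarked capabilities are omitted; ties are broken alphabetically.
    """
    marked = {
        name: validate_priority(value)
        for name, value in priorities.items()
        if value is not None
    }
    return sorted(marked, key=lambda name: (marked[name], name))
```

For instance, `rank_by_priority({"Data storage": 3, "Outreach": 1, "Containers": None})` lists "Outreach" first and leaves out the unmarked "Containers" entry.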

Useful Tools and Features in the Data Viewer

Benchmarking support: Check the Benchmark my Data option as shown in the image below to enable an overlay of your institution's coverage on any of the Capabilities Data graphs.

Screen-shot excerpt showing a checkbox to enable benchmarking.

Note: This option will only be shown if you are logged in and your institution has at least one completed assessment.

Share a visualization: If you want to share a visualization with your colleagues, just copy the address bar contents and share the full URL via email, etc. Note: for benchmarking visualizations, users will have to authenticate to ensure they have access rights to the data.

Download your visualization as an image: If you want to use a particular visualization in a report or presentation, you can download an image of the view using the tools just above the upper-right corner of the graph, as in the image excerpt from the graph widget below:

Screen-shot excerpt showing a camera icon used to download an image.

Things of note in the user interface:

Appendix I: Full Feature List and Notes

The Data Viewer has three main areas of functionality:

  1. Community Demographics
  2. Capabilities Data and Benchmarking
  3. Priorities Data (coming in a future release)

Each of these shares some common functionality to filter and control the visualizations.

Community Demographics provides graphical displays of information about the institutions in the dataset, including:

Capabilities Data and Benchmarking provides graphical displays of Capabilities Model Assessment Data in the Community Dataset, and (for users affiliated with current Contributors) the ability to benchmark assessment results relative to others in the Community Dataset.

A simple checkbox allows the user to overlay an indicator of their institution's coverage levels on any of the Capabilities Data graphs to compare their results to the community (or a subset thereof). To use this feature, users must be logged in and affiliated with an institution that has completed an assessment using the Capabilities Model, submitted it, and had it approved by the CaRCC data stewards. See the Quick Start Guide for a step-by-step example of benchmarking for your institution.

Two levels of detail are provided:

A series of visualization types are provided:

Common functionality to filter data:

For each of the visualizations, users can filter the data to define and explore data about specific sub-communities. The full set of filters is described below, although not all filters are relevant to a given view (and will be hidden in those views).

Filters can be combined so that users can, for example, filter the view to include only Public institutions in EPSCoR jurisdictions with research expenditures under $200M.

If the specified filter(s) define a set of institutions fewer than five (5), no results will be shown. This is done to preclude incidentally (re-)identifying contributing institutions.
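
Combining filters and suppressing small result sets can be sketched as below. This is a hedged illustration: the function name, the predicate-based approach, and the record keys (`control`, `epscor`, `research_expenditures_usd`) are assumptions made for the example and do not describe the Data Viewer's internals.

```python
MIN_INSTITUTIONS = 5  # result sets smaller than this are suppressed


def apply_filters(institutions, predicates, min_size=MIN_INSTITUTIONS):
    """Return institutions matching every predicate, or [] if too few match.

    Returning an empty list for small groups mirrors the rule that no
    results are shown when fewer than five institutions are selected.
    """
    selected = [
        inst for inst in institutions
        if all(predicate(inst) for predicate in predicates)
    ]
    return selected if len(selected) >= min_size else []


# Hypothetical filters: Public, in an EPSCoR jurisdiction, under $200M.
example_filters = [
    lambda inst: inst["control"] == "Public",
    lambda inst: inst["epscor"],
    lambda inst: inst["research_expenditures_usd"] < 200_000_000,
]
```

Passing `example_filters` to `apply_filters` would return the matching institutions only when at least five of them satisfy all three conditions.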

Note that filters are adjusted depending on the visualization chosen, since it makes no sense e.g., to filter on Public/Private if the user has selected a comparison by Public/Private status.

Common functionality to adjust visualizations:

Sharing a visualization: Once you have defined a view of interest that you would like to share, you can copy the browser URL to bookmark the view and/or to share it with colleagues. Note: if the view includes benchmarking data, users will have to log in to the portal to see the benchmarking data.
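
Because a view can be shared by copying its URL, the view settings are presumably carried in the address itself. The sketch below shows one common way to do that with query parameters. Everything specific here is hypothetical: the base URL is a placeholder, and the parameter names (`view`, `control`, `epscor`) are invented for illustration and are not the Data Viewer's actual parameters.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "https://example.org/dataviewer"  # placeholder, not the real address


def build_share_url(view, filters):
    """Encode a view name and its filter choices into a shareable URL."""
    params = {"view": view, **filters}
    # Sort the parameters so the same view always yields the same URL.
    return f"{BASE_URL}?{urlencode(sorted(params.items()))}"


def read_share_url(url):
    """Recover the view settings from a shared URL as a flat dict."""
    query = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in query.items()}
```

A recipient's browser would perform the equivalent of `read_share_url` to restore the same filters; any benchmarking overlay would still require the recipient to log in.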

Benchmarking institutional coverage:

In addition to the display of community data, a user can choose to overlay values from an approved assessment for their institution to benchmark their capabilities coverage relative to a filtered set of peers. A user can benchmark at the summary (Facings) level, or drill down to benchmark on the topics in each Facing. (Support for benchmarking at the individual question level will be in a future release.) Note that in order to use the benchmarking functionality, a user must be logged in and must be affiliated with an institution that has a completed and approved assessment using the Capabilities Model.
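
The idea of benchmarking against a filtered set of peers can be sketched as a comparison of one institution's coverage with a summary of the peer values for each Facing. The choice of the median as the summary statistic, the representation of coverage as a fraction between 0 and 1, and all names below are assumptions for illustration; the guide does not specify how the overlay is computed.

```python
from statistics import median


def benchmark_by_facing(own, peers):
    """Compare one institution's coverage with the peer median per Facing.

    `own` maps a Facing name to a coverage fraction; `peers` is a list of
    such mappings, one per peer institution (hypothetical data layout).
    """
    result = {}
    for facing, own_value in own.items():
        peer_values = [peer[facing] for peer in peers if facing in peer]
        if not peer_values:
            continue  # no peer data for this Facing, so nothing to compare
        peer_median = median(peer_values)
        result[facing] = {
            "own": own_value,
            "peer_median": peer_median,
            "difference": own_value - peer_median,
        }
    return result
```

A positive `difference` would indicate coverage above the peer median for that Facing, and a negative value coverage below it.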

Appendix II: Region Definitions

We use the IPEDS regions as defined here: https://nces.ed.gov/ipeds/search/viewtable?tableId=35945 and add two other regions: "Canada" (all provinces and territories) and "International" (for institutions in countries other than the U.S. and Canada).

We do not currently have Users or Contributors from the service academies*, and so do not currently include that IPEDS "region" in the interface.

New England region:

  • Connecticut
  • Maine
  • Massachusetts
  • New Hampshire
  • Rhode Island
  • Vermont

Mid East region:

  • Delaware
  • District of Columbia
  • Maryland
  • New Jersey
  • New York
  • Pennsylvania

Great Lakes region:

  • Illinois
  • Indiana
  • Michigan
  • Ohio
  • Wisconsin

Plains region:

  • Iowa
  • Kansas
  • Minnesota
  • Missouri
  • Nebraska
  • North Dakota
  • South Dakota

Southeast region:

  • Alabama
  • Arkansas
  • Florida
  • Georgia
  • Kentucky
  • Louisiana
  • Mississippi
  • North Carolina
  • South Carolina
  • Tennessee
  • Virginia
  • West Virginia

Southwest region:

  • Arizona
  • New Mexico
  • Oklahoma
  • Texas

Rocky Mountains region:

  • Colorado
  • Idaho
  • Montana
  • Utah
  • Wyoming

Far West region:

  • Alaska
  • California
  • Hawaii
  • Nevada
  • Oregon
  • Washington

Other U.S. jurisdictions:

  • American Samoa
  • Micronesia
  • Guam
  • Marshall Islands
  • Northern Mariana Islands
  • Palau
  • Puerto Rico
  • U.S. Virgin Islands

* U.S. service academies include the U.S. Naval Academy, the U.S. Military Academy, the U.S. Coast Guard Academy, the U.S. Air Force Academy, and the U.S. Merchant Marine Academy.
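
For readers who want to group their own data the same way, the region lists above can be expressed as a lookup table. This is a convenience sketch only: the names `REGIONS`, `REGION_OF`, and `region_for`, and the `country` argument with its "U.S." default, are assumptions introduced for the example.

```python
REGIONS = {
    "New England": ["Connecticut", "Maine", "Massachusetts", "New Hampshire",
                    "Rhode Island", "Vermont"],
    "Mid East": ["Delaware", "District of Columbia", "Maryland", "New Jersey",
                 "New York", "Pennsylvania"],
    "Great Lakes": ["Illinois", "Indiana", "Michigan", "Ohio", "Wisconsin"],
    "Plains": ["Iowa", "Kansas", "Minnesota", "Missouri", "Nebraska",
               "North Dakota", "South Dakota"],
    "Southeast": ["Alabama", "Arkansas", "Florida", "Georgia", "Kentucky",
                  "Louisiana", "Mississippi", "North Carolina",
                  "South Carolina", "Tennessee", "Virginia", "West Virginia"],
    "Southwest": ["Arizona", "New Mexico", "Oklahoma", "Texas"],
    "Rocky Mountains": ["Colorado", "Idaho", "Montana", "Utah", "Wyoming"],
    "Far West": ["Alaska", "California", "Hawaii", "Nevada", "Oregon",
                 "Washington"],
    "Other U.S. jurisdictions": ["American Samoa", "Micronesia", "Guam",
                                 "Marshall Islands", "Northern Mariana Islands",
                                 "Palau", "Puerto Rico", "U.S. Virgin Islands"],
}

# Invert the table: state or jurisdiction name -> region name.
REGION_OF = {place: region for region, places in REGIONS.items() for place in places}


def region_for(place, country="U.S."):
    """Return the region for a state or jurisdiction, or None if unknown."""
    if country == "Canada":
        return "Canada"  # all provinces and territories share one region
    if country != "U.S.":
        return "International"
    return REGION_OF.get(place)
```

For example, `region_for("Ohio")` gives "Great Lakes", while any Canadian province maps to the single "Canada" region.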

Footnotes

1 For more on the model, see the Introduction and Guide to Use.

2 IPEDS is the Integrated Postsecondary Education Data System. It is a system of interrelated surveys conducted annually by the U.S. Department of Education’s National Center for Education Statistics (NCES). https://nces.ed.gov/ipeds/.

3 Carnegie Classification of Institutions of Higher Education, https://carnegieclassifications.acenet.edu/

4 This simple taxonomy is based upon ECAR research by Braman et al. (2006). IT Engagement in Research: A Baseline Study. ECAR. https://www.academia.edu/4420620/IT_Engagement_in_Research_A_Baseline_Study

5 See https://new.nsf.gov/funding/initiatives/epscor/epscor-criteria-eligibility for more information.

6 Drawn from the FY 2023 Eligibility Index: https://www2.ed.gov/about/offices/list/ope/idues/eligibility.html.

7 For more about the roles associated with these areas, see the initial draft of a “Research Computing and Data Professionals Job Elements and Career Guide”, available at: https://carcc.org/wp-content/uploads/2019/01/CI-Professionalization-Job-Families-and-Career-Guide.pdf.