Data Governance

1.0 Introduction

pivot's goal is to provide the best user experience for maintaining, managing, and optimizing enterprise AV systems, using only trusted software and services to keep our customers' data safe and minimize risk. This document outlines pivot's data governance policies and the lifecycle of data that comes into pivot: how it is collected, managed, and archived. If you have any requests or questions, please contact the Data Manager.

2.0 Data Governance Committee

Data Manager: Crystal Tang (Product Manager)

Data Architect: Lindolfo Lacerda (Senior Backend Developer)

Data Owner: Eric Rauch (Product Owner)

3.0 Critical Components

  • GitHub code: Stored in pivot repositories. Repositories are private to the pivot organization and are not publicly accessible. Each developer also keeps a local clone of the code on their own machine, covering every version of the code they develop.
  • AWS DynamoDB data: Stored in DynamoDB, this data controls the output of the application. AWS backups are configured for the critical production tables, allowing point-in-time recovery (to the second) for a limited period (35 days). The tables are accessible only to employees registered in pivot's AWS organization, with MFA (multi-factor authentication) as an extra layer of access security management.
  • Processors/devices that produce data logs, reporting to S3: All the critical data about clients and their devices arrives through the log files these devices produce. The devices both write the log files and send them to AWS S3 buckets, at which point the data can be considered safe under AWS's security protocols combined with pivot's own. There are two risks to be wary of:
    • The logged data must be sent in a timely manner so it does not backlog and overload the storage capacity of these devices.
    • The AWS private key that is sent with the files, which is stored on the respective modules.

4.0 Data Collection

There are three primary types of data that pivot collects: data about the customers, data about their facilities and technology within them, and how the technology/facilities are used. Each of the types of data collected is broken down below.

4.1 Customer data

Customer data is given to pivot when a service agreement is made – it is communicated directly from the customer to the pivot team, covering the locations to activate and the users required for each location. It includes:

  • Company and location information
  • Users and their account information

4.2 Data about facilities and technology

Asset management data is uploaded either by pivot or by the customer, as provided by the integrator who installed the room systems. It includes:

  • The floors, rooms, and the room layouts
  • The expected functions of the room type systems
  • The devices within rooms
  • The asset management information about devices (e.g., IP address, serial number, etc.)
  • Scheduled maintenance history (e.g., firmware, device configuration)

4.3 How the facilities and technology are used

The majority of the remaining device/facility data is collected through log files sent from on-site processors or devices capable of doing so. These log files contain the data points programmed for each device, tracking whether devices are in use and communicating with the other devices in their respective system. The data in these files is processed through sets of proprietary rules and functions that determine whether systems are performing as expected, which in turn creates alert and troubleshooting data. It includes:

  • Live device data states and diagnostics as reported through each device's API.
  • Alerts based on systems not working as desired.
  • Service statistics in relation to the alerts.
  • Troubleshooting engine usage (through pivot).
  • Historical device/room issues.
  • Calendar Data, such as:
    • How often rooms are used.
    • What time they're used.
    • If they're scheduled to be used when not in a desired state of functionality.
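The rule-driven evaluation described above can be sketched as follows. This is an illustrative assumption only – the data-point names, expected values, and alert shape below are invented for the example, and pivot's actual rules and functions are proprietary:

```python
from dataclasses import dataclass

@dataclass
class DevicePoint:
    device_id: str
    name: str      # data point name, e.g. "display_power" (illustrative)
    value: str     # value reported in the log file

def evaluate_rules(points):
    """Compare each reported data point against its expected value and
    emit an alert record for every mismatch (hypothetical rule set)."""
    expected = {"display_power": "on", "signal_detected": "true"}
    alerts = []
    for p in points:
        want = expected.get(p.name)
        if want is not None and p.value != want:
            alerts.append({
                "device_id": p.device_id,
                "point": p.name,
                "expected": want,
                "actual": p.value,
            })
    return alerts
```

In this sketch, each alert record carries enough context (device, data point, expected vs. actual value) to feed the troubleshooting and service-statistics data listed above.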

For calendar integrations, pivot has read-only access to the calendar that a customer sets up. This information is limited to: meeting organizer and attendees, start time, end time, and the room booked for the meeting. The calendar itself cannot be affected by any functions pivot performs on the data received, as pivot only reads the data and has no add/edit/delete capabilities.
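The read-only restriction above can be illustrated as a field allow-list. The raw event shape and field names below are assumptions for the example – only the organizer, attendees, start/end times, and booked room are retained:

```python
def extract_meeting_fields(raw_event: dict) -> dict:
    """Keep only the calendar fields pivot is permitted to read;
    everything else (e.g. meeting body) is dropped, never stored."""
    allowed = ("organizer", "attendees", "start_time", "end_time", "room")
    return {k: raw_event[k] for k in allowed if k in raw_event}
```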

Any other data or metadata is a result of performing data science operations on the data mentioned above.

5.0 Data Management

pivot uses AWS as the infrastructure for all operations and is thus protected by the safeguards put in place by Amazon, alongside using best practices for each service that is involved in the platform. The Services that the data touches in AWS are:

  • EC2 (for Cisco codec devices)
  • S3 and S3 Glacier
  • Lambda
  • DynamoDB
  • Cognito and IAM (only for user management)

After the first stage of collection at the customer's on-premise processors/devices, the data is sent every 5 minutes over a secure link to Amazon S3, authenticated with a secret key and SHA-256-based request signing.

  • Once it is inside its respective S3 bucket, it is within pivot's private AWS organization and is protected by the strictest security settings available for the organization account, with best practices implemented (such as AES-256 encryption at rest).
  • From the S3 bucket the log file is processed through a Lambda function which parses the data into DynamoDB tables used to store the data displayed on the application. The parsed data is then placed in another S3 bucket to be moved into Amazon S3 Glacier later (mentioned in the following section, Data Archiving).
  • Cognito and IAM data is input by the pivot team when clients are first set up to create users and give them access levels and is unaffected by the data process involving the log files.
  • For Cisco codec devices, data is transmitted into AWS slightly differently. The codec device sends the data over an HTTPS connection to Amazon EC2, along with a UUID (a unique ID per device, coded into the program) and its serial number, which is reported directly from the device. Because the serial number comes directly from the device, it is difficult to imitate, adding an extra layer of security. The UUID and serial number must match a device stored in DynamoDB for the data to be accepted. Once it passes these checks, the EC2 instance creates a log file of the data and sends the file to S3 (within AWS) to continue the process from there.
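The codec admission check above – accept a payload only when its (UUID, serial number) pair matches a registered device – can be sketched as follows. The in-memory dict stands in for the DynamoDB device table, and both identifiers are illustrative placeholders:

```python
# uuid -> expected serial number (placeholder values, not real devices)
REGISTERED_DEVICES = {
    "0b1e6f2a-0000-4000-8000-000000000001": "FTT1234ABCD",
}

def accept_codec_payload(uuid: str, serial: str) -> bool:
    """Accept incoming codec data only if both the UUID and the
    device-reported serial number match a registered device."""
    return REGISTERED_DEVICES.get(uuid) == serial
```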

No one other than members of the pivot team on AWS has access to this data in S3, S3 Glacier, Lambda, DynamoDB, or Cognito/IAM.

6.0 Data Archiving

The majority of data that pivot uses for the day-to-day functions of the platform is the live data coming from device and processor logs. This data is constantly changing as systems are used and is key to determining when system alerts should be raised based on conditions and data values. The log files that bring in the device data can also help pivot troubleshoot complex customer issues that may not be clear from the live data alone. For this reason, the retention policy for the data log files sent to pivot's S3 buckets is as follows:

  • Receive log file in the respective S3 bucket and store here for 30 days

    • This is generally when there may be an issue that requires the pivot team to look back at historical logs of what happened before (and perhaps after) an incident occurred.
  • After 30 days have passed, transfer the file to S3 Glacier Storage for a further 60 days.

    • Glacier is Amazon's long-term archival service for flexible file types, best suited to data that needs to be stored and will likely never be accessed – but still can be if required.
    • The files can be accessed through a request to S3 Glacier and still be used in the event that they are necessary for debugging or troubleshooting.
  • After the total 90 days have passed (30 in S3, 60 in S3 Glacier), expire and discard the log files.

    • At this point it is highly unlikely that pivot will need the actual log file itself, and it is safe to delete.
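The 30/60/90-day policy above can be made explicit with a small helper. In production this kind of tiering is typically enforced by S3 lifecycle rules rather than application code; this sketch just encodes the boundaries for clarity:

```python
from datetime import date, timedelta

def storage_tier(received: date, today: date) -> str:
    """Return where a log file lives given its age: 0-29 days in S3,
    30-89 days in S3 Glacier, expired/discarded at 90 days total."""
    age = (today - received).days
    if age < 30:
        return "s3"        # hot: available for incident troubleshooting
    if age < 90:
        return "glacier"   # archived: retrievable on request
    return "expired"       # discarded after the full 90 days
```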

Data stored in DynamoDB has a different retention policy. As the log files are parsed into device data, this data is used for display on the platform. It lives as long as it does not change – in other words, a data point is stored in DynamoDB until it reaches a new value. Every time a data point changes, the change is logged in a file and sent to S3, so the stored value is updated in step with the real device. This does mean, however, that there is no fixed period for how long active device data is stored – it remains until it changes. For example, if a TV was powered off on January 1st, 2018, and stayed off until it was powered back on March 3rd, 2020, the data point in DynamoDB would show the TV as off until March 3rd, 2020. This ensures clients have an active state of all of their devices and are kept in the loop on what is connected/disconnected, on/off, etc.
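The change-driven behavior described above – a data point is rewritten, and a change event emitted, only when its value actually differs – can be sketched as follows. The class and field names are illustrative, not pivot's schema:

```python
class DeviceState:
    """Holds the current value of each (device, data point) pair and
    records a change event only when a reported value differs."""

    def __init__(self):
        self.current = {}   # (device_id, point) -> latest value
        self.log = []       # change events, as sent on to S3

    def report(self, device_id: str, point: str, value: str, timestamp: str):
        key = (device_id, point)
        if self.current.get(key) == value:
            return          # unchanged: nothing stored, nothing logged
        self.current[key] = value
        self.log.append({"device": device_id, "point": point,
                         "value": value, "at": timestamp})
```

This mirrors the TV example: an "off" state reported repeatedly between 2018 and 2020 produces no new entries; only the power-on in 2020 does.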

Data that is used for the Analytics feature of pivot has its own policy as well. This data includes service data, troubleshooting usage data, room/technology utilization data, and alert data generated. This data is stored for a longer term to create accurate measures of comparison for users to know how they are doing versus the industry average, or what devices can cause the most issues, to help advance the troubleshooting engine to identify more resolutions and root causes, and so on. All data in relation to analytics is stored for 3 years after the data is received to help further the development of the platform and create better insights for users on how to manage and maintain their AV systems.