- Print
- DarkLight
Purpose
The User CDP contains information on all users/employees within an organization. The User CDP is ideal for answering questions about the current and historical status of the organization's workforce and can be joined to OCSF event tables for further historical trend analysis.
Entity Correlation
The User CDP leverages the Entity Correlator to effectively track and manage the distinct users that DataBee has identified within the organization. The Entity Correlator utilizes several key fields to identify and differentiate users, listed below in their rank order of importance:
Employee UID: This unique identifier is the primary field used to distinguish each user within the system
Name: The username (i.e., userid) of the employee provides an additional layer of identification and helps in user recognition
Email Address: The email address serves as a crucial communication link and a final identifier
Understanding the Table
The User CDP is a Type 2 Slowly Changing table meaning that any modifications to an entity will result in the creation of new rows for that user ID, capturing the updated fields. The active
column for the older row is set to False
, ensuring that historical data remains unaltered and providing a clear distinction between current and previous states. Consequently, users can trace the evolution of the data over time, making it easier to perform audits and historical analyses.
End_time
is populated only if a user has been merged into another user by the Entity Correlator. In such cases, a new row is created to reflect the merged user, and the ‘end_time’ field is populated for all previous rows associated with the user. This ensures that there is a clear distinction between the historical data and the current state of the Entity Correlator. By tracking the end_time
, users can easily identify when a particular user entity was last active before being merged.
On the other hand, if a user’s attributes are merely updated without merging, the end_time
will not be populated for the older row. Instead, the active
field will be set to False
. This indicates that the older information is no longer current but still retains its historical significance.
Schema
Joining to CDPs and OCSF
If the security hygiene entitlement is enabled, user.id
can be used to join back to OCSF and CDP tables, otherwise another id like user.name
or user.email_addr
or user.employee_uid
should be employed.
Example Queries
Find all active users
SELECT * FROM user where active=TRUE
Find the latest record for a given id
SELECT * FROM user where id = {id} and active=TRUE
Find all active entities and their latest row
SELECT * FROM user where active=TRUE
How to join back to OCSF
SELECT * FROM CDP.user left join OCSF.{activity} on CDP.user.id = OCSF.{activity}.user_id