Ingest Methods
  • 09 Dec 2024

DataBee's flexible ingestion pipeline supports ingestion from leading security platforms, including Azure Active Directory, CrowdStrike, Azure MFA, Azure Sign-In, Palo Alto Networks, Zscaler, and more.

Data can be ingested via:

  • API - DataBee makes API calls to pull in events

  • HTTP - Platforms push data securely to DataBee via HTTP/S

  • Amazon S3 and Azure Blob - DataBee polls a storage bucket for events

  • Syslog - Logs can be sent to a DataBee Data Collector

  • Files - The DataBee Data Collector can read files mounted into its file system

Managing data sources

Data sources are the systems, applications, and platforms where data is generated or stored. DataBee extracts data from these data sources, transforms it as required, and loads it into a target system such as Snowflake. This section covers the data sources available in DataBee, how to add, delete, and edit them, and how to determine their current states.

Current data sources

Click the Data button to open the "Your current data sources" page, which lists all the data sources currently available within the tool. Available data sources include Azure Active Directory, CrowdStrike, Asset Optics, Azure MFA, and Azure Sign-In, among others. For each data source, the page shows its name, its state, the size of its raw data with a graphical representation, and the owner's name and e-mail.

A data source can be in one of the following states. These appear on the "Data" page after you have added and configured your data sources.

  • Healthy: ingest is occurring without issue

  • Error: an error occurred while provisioning the data source (the precise error is shown in a message in the UI)

  • Disabled: the user has opted to disable the ingest flow (essentially turning off the spigot)

  • In Progress: the user has begun, but not completed, configuring the data source

  • Processing: the user has completed configuring the data source and backend resource provisioning has begun

  • Unhealthy: the data source is processing less data than it historically has, which may indicate a disruption in the data ingest
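
If you track these states outside the UI (for example, in a hand-maintained inventory), the list above can be modeled as a small enumeration. The following is only an illustrative sketch: `DataSourceState`, `NEEDS_ATTENTION`, and `sources_needing_attention` are hypothetical names, not part of any DataBee API, and treating Error and Unhealthy as the states that need follow-up is an assumption based on the descriptions above.

```python
from enum import Enum


class DataSourceState(Enum):
    """State labels as they appear on the "Data" page."""
    HEALTHY = "Healthy"
    ERROR = "Error"
    DISABLED = "Disabled"
    IN_PROGRESS = "In Progress"
    PROCESSING = "Processing"
    UNHEALTHY = "Unhealthy"


# Assumption: these are the two states that usually warrant a closer look.
NEEDS_ATTENTION = {DataSourceState.ERROR, DataSourceState.UNHEALTHY}


def sources_needing_attention(sources: dict[str, str]) -> list[str]:
    """Return the names of sources whose state label is Error or Unhealthy.

    `sources` maps a data source name to the state label copied from the UI.
    An unknown label raises ValueError, which surfaces typos early.
    """
    return sorted(
        name
        for name, label in sources.items()
        if DataSourceState(label) in NEEDS_ATTENTION
    )
```

For example, passing `{"CrowdStrike": "Healthy", "Zscaler": "Unhealthy"}` returns a list containing only `"Zscaler"`.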

Add new data source

Click the + Add New Data Source button to open the "Add new data source" page, then select the data source you want to add by clicking its > button.

Configure data source

Selecting a data source takes you to the "Configure data source" page, where you enter the details required to connect to it. Make sure you already have an account with the data source you are connecting to. For more setup instructions, follow the link provided on the page. The steps for connecting a new log source are displayed on the right side of the page. Follow the steps and fill in the fields to configure the data source.

Delete data source

Click the trash can button for the data source you want to delete. A confirmation dialog box asks whether you are sure you want to delete the data source. Click Delete to proceed with the deletion, or click Close to cancel it.

Edit data source

To change the data source details you entered earlier, go to the "Your current data sources" page and click the edit icon (a pencil). This takes you to the "Configure data source" page, where you can make the necessary edits.

Data source configuration history

Navigate to the "Your current data sources" page and locate the data source whose configuration history you want to view. Click the Feed Configuration History icon, represented by the history symbol. The "Feed Configuration History" page opens and lists all configuration changes made to the selected data source, including the time of each change and the user or account responsible for it. Click the small arrow button on the right of an entry to see the specific field that was modified, the old value before the change, and the new value after it. This feature helps you track and analyze modifications to the data source configuration over time.
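
To make the shape of a history entry concrete, here is a minimal sketch of how a change record with a timestamp, a responsible user, and per-field old and new values could be derived from two versions of a configuration. It does not describe how DataBee implements the feature: `FieldChange`, `HistoryEntry`, `diff_config`, and the example field names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Optional


@dataclass(frozen=True)
class FieldChange:
    field: str       # name of the configuration field that was modified
    old_value: Any   # value before the change (None if the field was absent)
    new_value: Any   # value after the change (None if the field was removed)


@dataclass(frozen=True)
class HistoryEntry:
    changed_at: datetime        # when the configuration change was made
    changed_by: str             # user or account responsible for the change
    changes: list[FieldChange]  # one item per modified field


def diff_config(
    old: dict[str, Any],
    new: dict[str, Any],
    changed_by: str,
    changed_at: Optional[datetime] = None,
) -> HistoryEntry:
    """Compare two configuration dicts and record every field that differs."""
    changes = [
        FieldChange(key, old.get(key), new.get(key))
        for key in sorted(set(old) | set(new))
        if old.get(key) != new.get(key)
    ]
    return HistoryEntry(changed_at or datetime.now(timezone.utc), changed_by, changes)
```

Calling `diff_config({"poll_interval": 60}, {"poll_interval": 300}, "alice@example.com")` yields one entry whose single change records the field `poll_interval` with old value 60 and new value 300.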

Supported ingest file types

By default, DataBee supports CSV, TAR, and TAR.GZ as ingest file types. Files without these extensions are treated as either a single JSON object or a JSON array, depending on their content structure. If your files don't match these formats, contact your support engineer for manual configuration adjustments.
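
The rule above can be sketched as a pre-flight check you might run on files before sending them. This is an illustration under assumptions, not DataBee's actual logic: `classify_ingest_file` is a hypothetical helper, and matching extensions case-insensitively is an assumption the article does not confirm.

```python
import json


def classify_ingest_file(filename: str, text: str = "") -> str:
    """Guess how a file would be treated, based on the rule described above.

    Returns "csv", "tar", or "tar.gz" for the default extensions. Any other
    file is parsed as JSON and reported as "json-object" or "json-array".
    Raises ValueError if the content is not a JSON object or array.
    """
    name = filename.lower()  # assumption: extension matching ignores case
    if name.endswith(".tar.gz"):
        return "tar.gz"
    if name.endswith(".tar"):
        return "tar"
    if name.endswith(".csv"):
        return "csv"

    # No recognized extension: fall back to inspecting the content as JSON.
    parsed = json.loads(text)  # invalid JSON raises json.JSONDecodeError (a ValueError)
    if isinstance(parsed, list):
        return "json-array"
    if isinstance(parsed, dict):
        return "json-object"
    raise ValueError(f"{filename}: content is neither a JSON object nor a JSON array")
```

A file that raises `ValueError` here is a candidate for the manual configuration adjustments mentioned above.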

