05-17-2024 09:13 AM
I'm using a COLLECT parsing rule to manipulate data at the Broker VM level before ingestion.
The rule filters on raw logs that I generate for my test: log lines that contain the text criticalevent along with a date and a random machine name.
[Collect: vendor="unknown", product="unknown", target_broker=(mybroker), no_hit=drop]
filter _raw_log contains "criticalevent"
|alter a= someregex fn
|alter b=someregex fn
[Ingest:vendor="unknown", product="unknown", target_dataset="my_parsed_logs", no_hit=drop]
fields a,b,c ..
The resulting dataset gets all the data, not just the filtered data. If I put the same filter condition inside the INGEST section, it works. But does that mean the filtering happened at the Broker VM or on the XDR side?
Is there something missing here?
If I skip COLLECT and use only INGEST into the same dataset, I get the desired result, but I don't think that filtering happens at the broker. For example:
[Ingest:vendor="unknown", product="unknown", target_dataset="unknown_unknown_raw", no_hit=drop]
Filter _raw_log contains "criticalevent"
Am I misunderstanding something here?
05-27-2024 01:13 AM
Hello @Fm12345 ,
Thank you for reaching out on LIVEcommunity.
I would like to clarify a few things first.
Ingest: An INGEST section is used to define the resulting dataset.
https://docs-cortex.paloaltonetworks.com/r/Cortex-XDR/Cortex-XDR-Pro-Administrator-Guide/INGEST
Collect: A COLLECT section defines a rule that enables data reduction and data manipulation at the Broker VM. This helps avoid sending unnecessary data to the Cortex XDR server and reduces traffic, storage, and computing costs.
https://docs-cortex.paloaltonetworks.com/r/Cortex-XDR/Cortex-XDR-Pro-Administrator-Guide/COLLECT
Below is a sample that you can refer to and adapt to correct your rule as needed.
[COLLECT:vendor="Apache", product="ApacheServer", target_brokers = (bvm1, bvm2, bvm3), no_hit = drop]
alter source_log = json_extract_scalar(_raw_log, "$.source")
| filter source_log = "WebApp-Logs"
| fields source_log, _raw_log;
[INGEST:vendor="Apache", product="ApacheServer", target_dataset = "dvwa_application_log"]
alter log_timestamp = json_extract_scalar(_raw_log, "$.timestamp")
| alter log_msg = json_extract_scalar(_raw_log, "$.msg")
| alter log_remote_ip = json_extract_scalar(_raw_log, "$.Remote_IP")
| alter scanned_ip = json_extract_scalar(_raw_log, "$.Scanned_IP")
| fields log_msg ,log_remote_ip ,log_timestamp ,source_log ,scanned_ip , _raw_log;
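Applied to the rule from the question, the same pattern might look like the sketch below. It is untested and only reuses pieces already shown in this thread: the filter condition, the broker name mybroker, and the dataset my_parsed_logs come from the original post, and the layout (target_brokers in the plural, stages joined with | and closed with ;) follows the sample above. The regex extractions for a and b are not reproduced because they were not posted, and the // comment syntax is an assumption.

```
[COLLECT:vendor="unknown", product="unknown", target_brokers = (mybroker), no_hit = drop]
// Assumption: runs at the Broker VM, so only matching lines are forwarded
filter _raw_log contains "criticalevent"
| fields _raw_log;
[INGEST:vendor="unknown", product="unknown", target_dataset = "my_parsed_logs", no_hit = drop]
// Placeholder: the alter stages for a and b from the original post would go before this stage
fields _raw_log;
```

Note that the header in the original post uses target_broker, while the sample uses target_brokers, so that is one difference worth checking against the COLLECT documentation linked above.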
In case any further assistance is required, please feel free to reach out.
Regards