Data Exfiltration / Large file uploads

Luc_Desaulniers · ‎09-27-2021

Hello community,

I was wondering if anyone has found an efficient query to look for data exfiltration/large file uploads?

I'm looking at this more from a threat hunting perspective, where I would want to trace one or more files being uploaded to a remote destination.

Right now the only way I've found is to correlate file read actions in the same timeframe as a network session to a remote site. But this isn't a 100% reliable method, and it wouldn't hold up in court as evidence, since at the end of the day it is just a file read.
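To make the correlation concrete, here is a minimal Python sketch of it. The `FileRead` and `Session` shapes, the field names, and the five-minute lead window are all assumptions for illustration, not the schema of any product; the input would be whatever file-read and network events you can export from your endpoint logs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileRead:
    # Hypothetical shape of an endpoint file-read event
    pid: int
    path: str
    time: datetime


@dataclass
class Session:
    # Hypothetical shape of an outbound network session
    pid: int
    remote: str
    start: datetime
    end: datetime
    bytes_sent: int


def candidate_files(session, reads, lead=timedelta(minutes=5)):
    """Return paths read by the session's process shortly before or during the session."""
    window_start = session.start - lead
    return sorted({
        r.path for r in reads
        if r.pid == session.pid and window_start <= r.time <= session.end
    })


# Made-up sample data
t0 = datetime(2021, 9, 27, 10, 0, 0)
session = Session(pid=4242, remote="203.0.113.10", start=t0,
                  end=t0 + timedelta(minutes=10), bytes_sent=50_000_000)
reads = [
    FileRead(4242, "C:/finance/q3.xlsx", t0 - timedelta(minutes=2)),
    FileRead(4242, "C:/hr/salaries.csv", t0 + timedelta(minutes=1)),
    FileRead(4242, "C:/tmp/old.log", t0 - timedelta(hours=3)),      # too early
    FileRead(1111, "C:/finance/other.docx", t0 + timedelta(minutes=1)),  # other process
]
print(candidate_files(session, reads))
```

The result is only a list of candidates: it shows which files the uploading process touched around the session, not which bytes actually left the host.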

Does anyone have any suggestions?

Thank you

Luc D.

malalade · ‎09-27-2021

Please check out

1. XQL Query library: you can search for "upload" to see all related queries, like Large FTP Sessions, Curl uploading more than 1MB, etc.

2. XDR Analytics currently computes large uploads if there is applicable data. It triggers when an endpoint transfers an excessive amount of data to an unpopular destination.

Please check out:

https://docs.paloaltonetworks.com/cortex/cortex-xdr/cortex-xdr-analytics-alert-reference/cortex-xdr-...

You can get more interesting data if you have enhanced application logging from the NGFW. Sample fields (like session upload) from the session details are attached.
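As a rough illustration of the "excessive data to an unpopular destination" idea in point 2, the logic could be sketched in Python as below. This is not how XDR Analytics computes its alert; the thresholds (`min_bytes`, `max_hosts`), the tuple layout, and the sample hostnames are assumptions chosen only to show the shape of the check.

```python
from collections import defaultdict


def flag_large_uploads(sessions, min_bytes=100_000_000, max_hosts=2):
    """sessions: iterable of (host, destination, bytes_sent) tuples.

    Flags host/destination pairs whose total upload is large and whose
    destination is contacted by only a few hosts ("unpopular").
    """
    hosts_per_dest = defaultdict(set)
    bytes_per_pair = defaultdict(int)
    for host, dest, sent in sessions:
        hosts_per_dest[dest].add(host)
        bytes_per_pair[(host, dest)] += sent
    return sorted(
        (host, dest, total)
        for (host, dest), total in bytes_per_pair.items()
        if total >= min_bytes and len(hosts_per_dest[dest]) <= max_hosts
    )


# Made-up sample data
sessions = [
    ("pc-01", "files.example.net", 80_000_000),
    ("pc-01", "files.example.net", 70_000_000),
    ("pc-02", "cdn.example.com", 200_000_000),  # large, but a popular destination
    ("pc-03", "cdn.example.com", 1_000),
    ("pc-04", "cdn.example.com", 1_000),
]
print(flag_large_uploads(sessions))
```

A check like this tells you which process or host to look at, but it still says nothing about which files were in the upload.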

Luc_Desaulniers · ‎09-27-2021

Hi Malalade,

Thank you for the info, but this doesn't really answer my question, which is how you can identify what data was uploaded.

Let me know if you can think of anything. Right now I go with file reads from the process generating the large upload, but this is far from an exact science. I was wondering if others have figured out more efficient ways.

Thanks

Luc

MrDuck · ‎04-08-2022

Were you able to find a solution? I'm having a similar issue where I want to pull all large file uploads.

eluis · ‎04-11-2022

Hi @MrDuck, @Luc_Desaulniers

What @malalade answered, with the links to the alerts, is how you will see the uploads and be able to identify which uploads they were, since this information will be in the alerts and incidents created.

KR,

Luis

Luc_Desaulniers · ‎02-10-2023

While I appreciate the answer that was provided, it doesn't answer the original question, which is how you determine what data (files) was actually uploaded.

I know how to see the alerts, but those alerts don't contain file names/locations. They only contain session details, which I usually use to roughly work out what was sent by looking at the file read actions during the timeframe. I was wondering if others have found better ways?

Panagiss · ‎06-05-2023

@Luc_Desaulniers Did you find a way to achieve what you describe? I have the same questions here.

Luc_Desaulniers · ‎06-13-2023

From a couple of discussions, both with PA folks and with other incident response folks in the industry, you can't do a 1-to-1 match on this type of activity. You will basically have to rely on file read actions and "assume" that those are the files that were possibly exfiltrated. At the endpoint level, that's as granular as it will get based on what is being logged. The only way to know for sure would be to have outbound SSL inspection/decryption at your edge device and/or be able to replay the decrypted traffic in order to see what data is getting out.

If anyone has other suggestions feel free to share.

Thank you

Anatoli_Kalysch · ‎07-31-2023

@Luc_Desaulniers I think reconstructing the traffic, as you pointed out, would be the most efficient way to solve this. You have already outlined the prerequisites. For the sake of completeness, would a taint tracking approach using only the endpoints also be possible?

- taint the files that were classified as confidential

- propagate the taint for interactions with the files, taint the memory locations of the applications that read / opened the files

- if during a network transfer these tainted memory locations are used in an interaction with a socket, assume a leak

These points are highly complex, of course, and will probably require analysis of the applications in question. Some apps might apply additional optimizations in how and where they map structs. Comparing the resources needed for the two approaches, the network-based approach would win hands down. If there is no control over the network, or the necessary controls are not available in the network, then taint tracking would be the last resort that I would still see for creating some form of forensic evidence.
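As a toy illustration of the three steps above, here is a Python sketch that replays a list of events and propagates taint. It is a deliberate simplification: taint is tracked per process and per file rather than per memory location, and the `(action, pid, target)` event format is an assumption, not something an endpoint agent emits in this form.

```python
def find_leaks(events, confidential):
    """Replay (action, pid, target) events; return (pid, remote, sources) for tainted sends."""
    # Step 1: taint the files classified as confidential
    tainted_files = {path: {path} for path in confidential}
    tainted_procs = {}
    leaks = []
    for action, pid, target in events:
        if action == "read" and target in tainted_files:
            # Step 2: a process that reads a tainted file becomes tainted
            tainted_procs.setdefault(pid, set()).update(tainted_files[target])
        elif action == "write" and pid in tainted_procs:
            # Step 2 (continued): files written by a tainted process inherit the taint
            tainted_files.setdefault(target, set()).update(tainted_procs[pid])
        elif action == "send" and pid in tainted_procs:
            # Step 3: a tainted process using a socket is assumed to leak
            leaks.append((pid, target, sorted(tainted_procs[pid])))
    return leaks


# Made-up sample events
events = [
    ("read", 10, "/secret/plan.docx"),
    ("write", 10, "/tmp/archive.zip"),
    ("read", 20, "/tmp/archive.zip"),
    ("read", 30, "/public/readme.txt"),
    ("send", 20, "198.51.100.7"),
    ("send", 30, "198.51.100.7"),
]
print(find_leaks(events, {"/secret/plan.docx"}))
```

Process-level taint like this over-approximates heavily (every send from a tainted process counts as a leak), which is exactly why the memory-level version needs per-application analysis.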

Luc_Desaulniers · ‎09-28-2023

Thank you for your input. I believe that, strictly from an endpoint perspective, this is as close as it can get to forensic evidence.
