Leveraging Linux Internals to Supercharge Osquery Malware Detection
Using /proc to find fileless malware
Introduction
This post outlines what I believe to be a novel way to overcome the limitations of the osquery yara scanning table to find fileless malware on Linux operating systems.
Background
What is osquery?
osquery is a powerful open source toolset that exposes operating systems in a way that allows them to be queried with SQL. There are myriad use cases of this instrumentation, but we primarily use it to ask security relevant questions of our hosts. Over the last 10 years, many developers have enhanced the capabilities of the tool and many companies have grown around providing advanced services and features for the platform. For the purposes of this post, we are mainly concerned about the osquery YARA scanning module.
What is YARA?
YARA is a pattern-matching language primarily used for malware identification and detection. YARA rules allow InfoSec researchers to classify malware based on certain characteristics or patterns such as strings or byte sequences. The flexibility and performance of the YARA language enable very fast and accurate scanning with relatively low performance impact compared to traditional AV or operating system (OS) based pattern matching like grep.
What is '/proc'?
In Linux, the /proc filesystem is a virtual/abstract file system that provides an interface to kernel data structures and information about the OS. It doesn't represent actual files on disk but rather dynamically generates information about the kernel, processes, and various system parameters. Essentially, /proc is an example of how "everything on Linux is a file".
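To make the "everything is a file" idea concrete, here is a minimal Python sketch that reads a process name with ordinary file I/O. It assumes a Linux host where /proc/<pid>/comm exists; the helper name read_process_name and its proc_root parameter are illustrative assumptions, not part of osquery.

```python
import os
from pathlib import Path
from typing import Optional


def read_process_name(pid: int, proc_root: str = "/proc") -> Optional[str]:
    """Return the short command name the kernel reports for pid, or None."""
    comm = Path(proc_root) / str(pid) / "comm"
    try:
        # On a real /proc the kernel generates this content at read time;
        # nothing is stored on disk.
        return comm.read_text().strip()
    except OSError:
        # Process exited, permission denied, or not running on Linux.
        return None


if __name__ == "__main__":
    # Prints this interpreter's own process name on Linux, None elsewhere.
    print(read_process_name(os.getpid()))
```

The proc_root parameter exists only so the helper can be pointed at a test directory; on a live system the default is what you want.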
How is YARA used in osquery
osquery supports a YARA table. The table takes a mandatory file path argument along with a rule definition, both of which are described in the osquery documentation. Out of the box, this means there must be a file on disk for the YARA rule to scan, and the feature works very well for that case.
The required file path argument is also a significant limitation for detection engineers, because many threat actors employ fileless or in-memory malware to hide their implants and, ultimately, avoid detection. Thinking about this problem, I came up with a novel way to leverage the virtual Linux filesystem /proc to satisfy the file path requirement while also scanning for fileless implants.
The Problem
There are a variety of ways to hide a process on the Linux OS. For the purposes of this article, we will wave our hands and assume our attacker has either dropped a file, executed it, and unlinked the binary from disk, or has utilized the dynamic loader via dlopen() to shove the bytes into memory prior to execution. Either way, there is now an unknown malicious process running on our machine, and the file backing it is no longer on disk.
Using osquery itself, we can usually find these processes:
SELECT * FROM processes WHERE on_disk = 0;
This very basic query will return all processes where the original binary is no longer present on the machine. For more information, there are many blogs that go into great detail about finding orphaned processes using osquery and other tables and heuristics.
Knowing that there is an unlinked process running on our machine is great, but how can we prove that this orphaned process is actually malicious? Typically, we could run another osquery query against this process using the YARA table. Again, we will skip the deep technical details about choosing the right YARA rule and provide a basic example.
SELECT * FROM yara WHERE path = '...' AND sigrule IN ('
rule bad_stuff
{
    meta:
        description = "This is just an example"
        threat_level = "midnight"
    strings:
        $a = {41 41 41 ?? 41 41 41 41}
        $b = "I am a malicious file"
    condition:
        $a or $b
}')
AND matches = 'bad_stuff';
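For readers new to YARA, the hex string $a matches three 0x41 bytes, any single byte, then four more 0x41 bytes, and $b is a plain text string. The sketch below approximates the rule's condition in pure Python with the standard re module. It is only an illustration of what the patterns mean, not how YARA or osquery evaluates rules, and matches_bad_stuff is a hypothetical helper.

```python
import re

# $a = {41 41 41 ?? 41 41 41 41}: "??" is a one-byte wildcard.
# re.DOTALL lets "." match any byte, including b"\n".
PATTERN_A = re.compile(rb"\x41{3}.\x41{4}", re.DOTALL)

# $b = "I am a malicious file"
PATTERN_B = b"I am a malicious file"


def matches_bad_stuff(data: bytes) -> bool:
    """Approximate the example rule's condition: $a or $b."""
    return PATTERN_A.search(data) is not None or PATTERN_B in data
```

Calling matches_bad_stuff on the bytes of a suspect file gives a rough feel for what the rule would flag, although a real YARA engine should be used for actual detection.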
The problem for our detection engineer should now be apparent: how can we provide the mandatory file path argument to the yara query if there is no file on disk?
The Solution
Putting together the aforementioned technologies, we can combine osquery, YARA, and /proc to scan for in-memory and fileless malware on Linux. To do this, we specifically target the /proc/<pid>/map_files directory for a given process ID (pid). At a very high level, map_files exposes each of a process's file-backed memory-mapped regions as a link named after the region's address range, so the mapped content can be opened like a "file" even when the original is gone from disk.
Caveat: this solution may not work in cases where the malware uses mmap() to map memory as RWX and writes the ELF directly into that memory before jumping to it. That technique will not necessarily yield files in the same way.
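One hedged way to look for the case this caveat describes is to parse /proc/<pid>/maps and flag executable regions that have no backing path, since anonymous mappings get no entry under map_files. The Python sketch below is an illustration only. anonymous_exec_regions is a hypothetical helper, and because JIT runtimes legitimately create such regions, a hit is a lead to investigate rather than a verdict.

```python
from typing import List, Tuple


def anonymous_exec_regions(maps_text: str) -> List[Tuple[int, int, str]]:
    """Return (start, end, perms) for executable mappings with no pathname."""
    regions = []
    for line in maps_text.splitlines():
        # Line format: address perms offset dev inode [pathname]
        fields = line.split(maxsplit=5)
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        addr_range, perms = fields[0], fields[1]
        pathname = fields[5].strip() if len(fields) > 5 else ""
        # Pseudo-paths such as [vdso] or [stack] count as named and are skipped.
        if "x" in perms and not pathname:
            start, end = (int(part, 16) for part in addr_range.split("-"))
            regions.append((start, end, perms))
    return regions
```

The function takes the text of a maps file rather than a pid, so it can be fed the contents of /proc/<pid>/maps read by any means.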
Example ls output for a random /proc/<pid>/map_files/ directory:
lr-x------ 1 root root 64 Jan 11 06:40 7f8f80403000-7f8f80404000 -> /lib64/libc-2.5.so
lr-x------ 1 root root 64 Jan 11 06:40 7f8f8061e000-7f8f80620000 -> /lib64/libselinux.so.1
lr-x------ 1 root root 64 Jan 11 06:40 7f8f80826000-7f8f80827000 -> /lib64/libacl.so.1.1.0
lr-x------ 1 root root 64 Jan 11 06:40 7f8f80a2f000-7f8f80a30000 -> /lib64/librt-2.5.so
lr-x------ 1 root root 64 Jan 11 06:40 7f8f80a30000-7f8f80a4c000 -> /lib64/ld-2.5.so
Using this knowledge of Linux internals, we can now actually provide a path argument to our YARA query.
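As a rough illustration of what a scanner sees in that directory, the following Python sketch lists the entries and resolves each link target. list_map_files and its proc_root parameter are hypothetical names chosen for this example, and reading another process's map_files usually requires elevated privileges.

```python
import os
from typing import List, Tuple


def list_map_files(pid: int, proc_root: str = "/proc") -> List[Tuple[int, int, str]]:
    """Return (start, end, target) for each entry in <proc_root>/<pid>/map_files."""
    directory = os.path.join(proc_root, str(pid), "map_files")
    entries: List[Tuple[int, int, str]] = []
    try:
        names = os.listdir(directory)
    except OSError:
        return entries  # process gone, no permission, or not Linux
    for name in sorted(names):
        try:
            # Entry names look like "7f8f80403000-7f8f80404000" (hex addresses).
            start, end = (int(part, 16) for part in name.split("-"))
            target = os.readlink(os.path.join(directory, name))
        except (ValueError, OSError):
            continue  # unexpected name, or the mapping vanished mid-scan
        entries.append((start, end, target))
    return entries
```

Each returned path under map_files is exactly the kind of value that can be handed to a file-based scanner in place of an on-disk binary.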
Finding a Sliver Implant
Sliver is a well-known post-exploitation framework developed by Bishop Fox. It offers advanced capabilities and methods commonly used by red teams and attackers alike. Using a slightly modified version of an open-source YARA rule we can find the latest version of Sliver running in-memory on Linux.
Once we have proven that our rule can successfully detect a single process that we know to be Sliver, we can craft a query that can generically detect any Sliver process without knowing the pid. Essentially, we can now use osquery to create a generic Sliver implant detection.
Example Query
Based on https://github.com/Immersive-Labs-Sec/SliverC2-Forensics/blob/main/Rules/sliver.yar.
SELECT *,
(SELECT pgroup
FROM processes p
LEFT JOIN process_open_files pof ON p.path = pof.path
WHERE on_disk = 0
AND name NOT IN ('agetty','dhclient','rpc.statd')
AND name NOT LIKE 'systemd%') AS pid
FROM yara
WHERE path LIKE '/proc/' || pid || '/map_files/%'
AND sigrule IN ('
rule sliver_binary_native {
strings:
$sliverpb = "sliverpb"
$bishop_git = "github.com/bishopfox/"
$encryption = "chacha20poly1305"
condition:
(
(uint32(0) == 0x464c457f)
)
and $sliverpb and $bishop_git and $encryption
}
rule sliver_memory {
strings:
$s1 = "sliverpb"
$s2 = "*PivotListeners"
$s3 = "(*Register).GetActiveC2"
condition:
$s1 or ($s2 and $s3)
}'
) AND (matches = 'sliver_memory' OR matches = 'sliver_binary_native');
The magic of this query is in the sub-SELECT that dynamically generates a list of suspicious PIDs for us. By leveraging the processes.on_disk column, we can surface all running processes that are not backed by a file. Note: you can also see this by looking at /proc/<pid>/exe, whose link target is marked "(deleted)".
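The same check can be sketched outside osquery. The Python below walks a proc-style directory and reports PIDs whose exe link target ends in " (deleted)". It approximates what on_disk = 0 surfaces and is not osquery's implementation; find_deleted_exe_pids is a hypothetical helper.

```python
import os
from typing import Dict

DELETED_SUFFIX = " (deleted)"


def find_deleted_exe_pids(proc_root: str = "/proc") -> Dict[int, str]:
    """Map pid -> original executable path for processes whose binary was unlinked."""
    results: Dict[int, str] = {}
    try:
        names = os.listdir(proc_root)
    except OSError:
        return results  # no /proc here, or it is unreadable
    for name in names:
        if not name.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            target = os.readlink(os.path.join(proc_root, name, "exe"))
        except OSError:
            continue  # kernel thread, exited process, or insufficient permission
        # Heuristic: a file literally named "x (deleted)" would also match.
        if target.endswith(DELETED_SUFFIX):
            results[int(name)] = target[: -len(DELETED_SUFFIX)]
    return results
```

Calling find_deleted_exe_pids() with no arguments on a Linux host returns the candidate PIDs that would then be examined further.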
Once we have our list of process identifiers, we can use the SQLite concatenation operator "||" to dynamically craft the path arguments that feed into the rest of the query. From here, we just run the typical osquery YARA query with our sigrule value defined.
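The concatenation step can be tried in isolation with Python's built-in sqlite3 module. In this sketch the suspects table is a hypothetical stand-in for the PIDs returned by the sub-SELECT; it is not an osquery table, and build_map_files_patterns is an illustrative name.

```python
import sqlite3
from typing import Iterable, List


def build_map_files_patterns(pids: Iterable[int]) -> List[str]:
    """Return one LIKE pattern per pid, built with SQLite's || operator."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE suspects (pid INTEGER)")
        conn.executemany(
            "INSERT INTO suspects (pid) VALUES (?)", [(pid,) for pid in pids]
        )
        # || converts the integer pid to text and joins the three pieces.
        rows = conn.execute(
            "SELECT '/proc/' || pid || '/map_files/%' FROM suspects ORDER BY pid"
        )
        return [row[0] for row in rows]
    finally:
        conn.close()
```

Each resulting string has the shape of the LIKE pattern used in the example query, with the trailing % matching every entry in that process's map_files directory.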
From a defender's perspective, if this query hits in your environment, you now have a process ID that you can treat with high confidence as malicious. This post shows how versatile and powerful osquery really is, as well as the constant back-and-forth battle between attackers and defenders. I hope you learned something and are inspired to push the limits even further.