Refactoring (step up module architecture, establish central location for common functionality, cleanup linux analyzer) #58

luis261 · 2024-05-04T21:32:53Z

Module excecution improvements (for now only applied to the linux analyzer, other modules will be modified in dedicated PRs):

the simplest modules, which don't need any upfront input can simply be run via import
more involved modules shoud provide the following definitions:
- run: a function that takes in all required arguments as parameters, which would otherwise be provided via sys.argv if running the module directly in a shell (instead of inside Python via import)
- main: a function which wraps around run, executing it with sys.argv input
- an if statement that runs the main function if __name__ == "__main__"
in qu1cksc0pe.py, we can then import modules as required without the need for subshells/os commands (if they need input, we pass it via <modulename>.run)
this architecture let's us avoid the need for workarounds to pass inputs to analyzers, e.g. via the .path_handler tempfile (instead, we just pass any such needed information to a given module via a parameter of its local run function)

Note

changes below were already communicated/agreed upon previously:

Conversion of Modules directory into a Python package

this is worth it I think alone for sharing functionality (see below), but there is more to it than just that:

Having had enough experience myself with Python's quirky import system, I think I have some understanding for why it might be preferable to avoid dealing with it in some circumstances. Can you tell me the specific reasons though why you execute modules via os.system instead of staying inside a single Python interpreter instance if there are any?

If you're ok with it, I'll slowly move us towards staying inside a single Python instance by leveraging import statements instead of invoking modules directly via the operating system, since I think the advantages outweigh/justify using the established import system:

performance (launching a subshell via os.system every time we execute a module is more expensive in terms of overhead instead of simply importing into the current Python thread)
security (less risk of accidentally exposing/running different files on the system that aren't part of the program)

ideally we would rename the directory to be lowercase as part of this change to conform to Python module naming conventions .. I held off on it for now, since it's not strictly necessary and I'm unsure of the implications in terms of the other locations we'd have to adjust

Extraction of utility function previously strictly local to main module into separate module

might not seem sensible for now given that it only houses a single, tiny function
I'm sure it'll pay off in the long haul/soon though
- can continue extracting duplicate functionality there, to be reused across different modules

luis261 · 2024-05-04T21:34:10Z

Hello @CYB3RMX, I just need a yes/no from you for now on whether you're ok with my current trajectory so I can keep the changes coming. If not, let me know and I'll try to adjust.

CYB3RMX · 2024-05-04T22:14:12Z

Hello @luis261 !

Thank you for your work and support. I think this architecture will make the project's codebase more understandable and maintainable. I am also considering adjusting the codebase to make packaging this project easier for future processes.

luis261 · 2024-05-04T22:17:22Z

Hi @CYB3RMX (:

I'm excited to hear that! 🚀

I'll continue my work tomorrow and probably ping you in a few days when I've built up some more substantial changes (having the ability to import shared functionality definitely enables me to do that)

CYB3RMX · 2024-05-06T11:36:28Z

Hello again @luis261 !

When I was checking your changes, the code threw these errors.

Could you please also try to run program after your changes?

luis261 · 2024-05-06T15:31:10Z

Hi @CYB3RMX,

thank your for making the effort and trying to test the changes already!
Yes, you make a good point, there are still importing issues, as you discovered, this isn't ready to be merged yet.

Let me work on some other changes first, I already know the cause of the issue you're talking about.
That's what I meant with Python's quirky import system, it can be quite annoying to get it setup right.
(but once you do, it's worth it)

This issue is related to us running the modules via os.system at the moment instead of actual import's.
I'll get back to you once I've worked on some other, related changes in a few days. I estimate to get this in a state ready to be merged towards end of this week. I'll convert the PR from draft mode once it's ready (:

luis261 · 2024-05-08T17:19:49Z

I'm currently working on refactoring the linux analyzer as part of this overall change.

Just a small question only indirectly related to the refactoring: why do you consider it "bad" to have an NX bit?
I'm assuming you do because it is printed as red if present. Is that intentional?

Qu1cksc0pe/Modules/linAnalyzer.py

Line 167 in 7319b9f

nxstr = "[bold red]True"

CYB3RMX · 2024-05-09T07:28:17Z

I added this feature to check if the target sample has binary protections such as NX or PIE.

luis261 · 2024-05-09T08:24:00Z

Yes, I understand the point of the feature, I just don't get why we currently print "True" in bold red if NX is present, since it's a positive feature which increases security? Shouldn't it be green? I guess it's not that important, I'll address that point in a separate PR after the refactoring

CYB3RMX · 2024-05-09T08:30:03Z

I thought that if the malware sample has some binary protections, it wouldn't be good for us. By the way, did you run the program after your changes?

luis261 · 2024-05-09T08:37:50Z

The way I understood, having an NX bit is probably good overall and shouldn't affect the binary instrumentation process? https://ctf101.org/binary-exploitation/no-execute/

But I'm not sure, you know more about analysis than me and I guess it makes sense for it to remain as is.

Regarding the state of the program: yes, I did run it, although I don't have a great testing setup for Linux right now. I'm still working on refactoring the linux analyzer overall, to show you an example of what a refactored module could look like. I think I'll get that done today/tomorrow. After that, I'll fix the overall import situation (a simple, albeit annoying fix, because I'll need to modify the line in every module again)

CYB3RMX · 2024-05-09T08:50:27Z

The error is similar as the previous one. But dont worry we can fix this anyway.

luis261 · 2024-05-09T08:56:11Z

Yes, I know. I appreciate you trying to run it already, but currently there is no sense in trying that though. This branch will still be unstable until at least tomorrow. I will ping you though once it's ready ok?

CYB3RMX · 2024-05-09T08:58:32Z

Okay no problem. Thank you :)

luis261 · 2024-05-09T09:01:57Z

You're welcome, I like working on this project (:

Thank you for your patience and ongoing support!

luis261 · 2024-05-09T20:11:40Z

Just pushed a few more commits. Still not ready.

Making some progress though, the linux analyzer specific part of the refactoring is almost done now.

luis261 · 2024-05-09T20:54:04Z

In case you're reading this soon: please continue to ignore my changes for now. But it'll be ready soon, I'll probably keep working and move this out of the draft state before sunrise (we're in a similar timezone). Until then, I still have to:

finish cleaning up/fixing post-refactoring for the linux analyzer
take care of the import issue
explain considerations regarding the import setup
create a small summary/writeup for the refactoring so it's easier for you to review

CYB3RMX · 2024-05-24T11:09:49Z

Hello @luis261 !

I checked your changes and tried to perform a scan against a simple Linux sample. Everything worked properly in this case. After that, I installed Qu1cksc0pe on my system using the --install argument and tried to analyze the target sample again. After this, the program threw this error:

I think we should also do something for the utility/helper modules. We can import them from the sc0pe_base folder, I guess.

luis261 · 2024-05-24T11:22:06Z

Hello @CYB3RMX (:

thank you for your comprehensive testing, good finding, I'll look into it right away!

luis261 · 2024-05-24T11:35:31Z

@CYB3RMX

ok, so at first glance, I don't think I understand the installation process.

you copy the qu1cksc0pe.py file to /usr/bin/qu1cksc0pe here:

Qu1cksc0pe/Modules/installer.sh

Line 59 in e11dfe4

 cd $sc0pe_path && cp qu1cksc0pe.py /usr/bin/qu1cksc0pe && chmod +x /usr/bin/qu1cksc0pe 

but the Modules folder itself is never moved/copied right? So how can execution from the entrypoint in usr/bin ever succeed when modules are needed for carrying out analysis?

luis261 · 2024-05-24T11:40:35Z

Oh ok I understand now, it works via a config file:

Qu1cksc0pe/qu1cksc0pe.py

Line 78 in e11dfe4

 if os.path.exists("/usr/bin/qu1cksc0pe") == True and os.path.exists(f"/etc/qu1cksc0pe.conf") == True: 

That makes things a lot more complicated import-wise. I'll have to see if I can work something out this evening. But I don't think I can make it work without also copying the Modules directory over to /usr/bin. We'll see I guess, I might have some questions for you later.

CYB3RMX · 2024-05-24T11:41:58Z

Okay man :) Feel free to ask

luis261 · 2024-05-24T11:46:52Z

Thank you mate, I'll try to figure sth out once I'm off the clock and then present you with a list of options tonight/tomorrow (:

luis261 · 2024-05-25T07:18:09Z

@CYB3RMX

ok, so here's my current understanding of the install mechanism (Linux only, not taking Docker into account):

initially, the project obviously just sits at whatever location in the local filesystem which the installing user chose to clone it to
the user is expected to run setup.sh for proper setup
- this bash script installs various requirements; Jadx among other things, which is downloaded to /home/$USER/sc0pe_Base/jadx in particular
when executing the cloned qu1cksc0pe.py file with the --install switch, the installer.sh script under ./Modules/ is executed
- qu1cksc0pe.py passes the location of the cloned repo to the bash script
- the installer script creates a config file /etc/qu1cksc0pe.conf, containing a value that references a directory of location /opt/Qu1cksc0pe
- using the passed path, the cloned files are then copied to /opt/Qu1cksc0pe
- a copy of the "main" Python entrypoint is finally written to /usr/bin/qu1cksc0pe and marked as executable, owned by the current user

This means for a full Qu1cksc0pe installation running on Linux, the intended entrypoint is /usr/bin/qu1cksc0pe, as seen above in your screenshot. On execution, it looks for the config file under /etc and if present, utilizes the referenced path to execute modules as needed, lying under /opt/Qu1cksc0pe/Modules/. If no such config file is present, this indicates no proper install was carried out. As such, the entrypoint just runs against the current working directory and assumes the modules to execute are located there (this works for running directly in the cloned location or running the copy under /opt/Qu1cksc0pe/).

The problem here is that the entrypoint under /usr/bin/ can't find the Modules package, which sits at a different location(s).

Based on this understanding, here are the options, odered by rising complexity/disruptiveness to the current way of doing things:

since we're currently using "module-unaware" imports inside the Python modules anyway, we could just permanently stick to that way of importing. I'd have to remove the linAnalyzer import though and replace it with an execution via shell again. I'd also have to provide local copies of required utilities for the entrypoint (the analyzer modules can still share functionality via the locally importable module though). This would still be manageable, since the entrypoint is only using utils.err_exit, which could easily be defined inline instead.

advantages: no adjustments to the current way of doing things required
disadvantages: we'll be stuck on executing the Python modules in subshells, can never import them from the entrypoint directly; as such the entrypoint itself also won't have access to shared functionality defined in /opt/Qu1cksc0pe/Modules/utils.py

sys path hack: in the entrypoint, I could append the known location of the Modules package under /opt/Qu1cksc0pe/ to sys.path. That way, Python will be able to find the Module package when we try to import it (the current working directory is part of sys.path anyway, so the other cases are taken care off by default anyway).

advantages: targeted modification at arguably the right location/source of the issue, without affecting the rest of the project and avoiding the need for further workarounds
disadvantages: I consider this to be an OK solution; however, it's not considered clean from a Python perspective

place a simple bash entrypoint under /usr/bin/, which simply redirects to the actual location under /opt/, or falls back on the clone location

advantages: pragmatic solution, avoids duplication
disadvantages: unsure about the details of this, might introduce problems for passing around the file to analyze?

copy all project files to /usr/bin/

advantages: pragmatic solution, avoids duplicated entrypoints
disadvantages: idk, not the proper UNIX way of doing things? might not work?

I suggest we go with option 2, but let me know what you think please. Then I can implement the chosen option (:

…s directory to path

luis261 · 2024-05-25T09:55:35Z

@CYB3RMX

actually I took the liberty of just implementing the 2nd option already, that way you can see what I mean and if you're ok with the solution we can finally get this done (:

CYB3RMX · 2024-05-25T10:30:22Z

Exactly. The second option seems more logical. By the way, while adding updates, make sure to consider the --install part as well, my friend :)

luis261 · 2024-05-25T10:50:29Z

Awesome, thanks for the swift feedback 🤗

yes, I'll keep this projects ways of performing installation in mind for future changes.
Regarding this specific PR though, we should be fine now when it comes to the impact on the installation process right?

CYB3RMX · 2024-05-25T10:55:31Z

Absolutely. But our priority should always be ensuring that the program runs smoothly. This way, people using the program won't experience any issues :)

luis261 · 2024-05-25T11:07:25Z

That's true for sure, but I'm not sure I get what you're trying to say: do you mean just in general or is there something else I need to consider regarding this specific PR before we can hopefully get it merged?

CYB3RMX · 2024-05-25T11:59:54Z

What I mean is that we can approve the PR once we are sure everything is working correctly.

luis261 · 2024-05-25T12:17:30Z

That makes sense (:

Do you think you could run it again with the --install option for me, like you did above? Does not have to be now/today if you don't have time.

CYB3RMX · 2024-05-25T12:20:34Z

Alright. Send your changes so we can test them. If there are no problems, we'll complete the PR :)

luis261 · 2024-05-25T12:32:53Z

Awesome, thank you so much 🙏

I already pushed and committed ba52c65bf9e6a2d1fd2ab56e6457437b57f0267e this morning. That small change already constitutes the implementation of option 2, thus hopefully solving our problem (:

CYB3RMX · 2024-05-27T11:25:20Z

Hello @luis261 !

I checked your code, and it seems to be working properly (LinuxAnalyzer module). I also need to check the other modules.

Note: I checked the --console mode. When we try to analyze a simple sample, it throws this error:

That one is for the Windows side:

I think we should add a default value for the report section

luis261 · 2024-05-27T11:57:54Z

@CYB3RMX thank you so much for your effort and attention to detail! Good to know the current status, I'm glad to hear that at least the actual analysis internals of the linux analyzer are stable again (:

As for the report issue when executing the module directly: yes, that's on me, careless mistake to access that index without a prior check, here's the situation:

there already is a default value for the actual functionality in the analyzer class' analyze method signature
it's just failing right now because I'm accessing in an unsafe manner, thus causing the IndexError
I've just pushed an update that includes a length check, this should fix it (780bbe3)

def analyze(self, indicators_by_category, emit_report=False):
    ...

luis261 · 2024-05-27T12:05:03Z

I'm a bit disappointed (in myself) regarding the integration changes and delay this refactoring is causing.
Thank you for your patience though. Going forwards, I'll be able to avoid changes across such a wide breadth of modules to circumvent the need for such involved testing (across different modules, different ways of execution etc).

This means the risk for upcoming changes will be more predictable/of smaller impact or scope. For now, there was no way to address some points of improvement without affecting all modules though.

…n analyzers

luis261 · 2024-05-27T15:22:44Z

@CYB3RMX

Sorry about the earlier information overload, I investigated this further in the meantime, here are the facts, as compact as I can manage:

when filtering out any occurences related to "file", only 14 occurences of raw sys.argv indexing remain
I analyzed the context for each of these code snippets: it's not problematic in these cases because all of these analyzers only get called from qu1cksc0pe.py in a manner where all expected values are present (so sys.argv is long enough => no IndexError)

Overall, we should be fine now. Again though, these IndexErrors weren't introduced by my refactoring, as evidenced by: https://github.com/CYB3RMX/Qu1cksc0pe/blob/master/Modules/winAnalyzer.py#L643 (main branch)
(so the errors should also be reproducible on main at the current version: e11dfe4)

Qu1cksc0pe>git grep "sys.argv\[" | python txtfilter.py -x Modules/utils.py file get_argv
Modules/VTwrapper.py: apikey = str(sys.argv[1])
Modules/andro_familydetect.py:targetApk = sys.argv[1]
Modules/apkAnalyzer.py:targetAPK = sys.argv[1]
Modules/apkAnalyzer.py: if sys.argv[3] == "JAR":
Modules/apkAnalyzer.py: if sys.argv[3] == "DEX":
Modules/apkAnalyzer.py: if sys.argv[2] == "True":
Modules/email_analyzer.py:target_eml = sys.argv[1]
Modules/hashScanner.py: if str(sys.argv[1]) == '--db_update':
Modules/hashScanner.py: if str(sys.argv[2]) == '--normal':
Modules/packerAnalyzer.py: if str(sys.argv[1]) == '--single':
Modules/packerAnalyzer.py: elif str(sys.argv[1]) == '--multiscan':
Modules/pcap_analyzer.py:target_pcap = sys.argv[1]
Modules/powershell_analyzer.py:target_pwsh = sys.argv[1]
Modules/windows_dynamic_analyzer.py:target_pid = int(sys.argv[1])

CYB3RMX · 2024-05-27T15:27:18Z

No problem, man. I just meant to say let's come up with a solution for this too.

luis261 · 2024-05-27T15:28:16Z

Yes, thank you my friend (:

My most recent changes should fix the issues you found 🤞

luis261 added 2 commits May 4, 2024 22:04

Make modules directory a py package

3907cfc

Establish generic utils file, reachable for all analyzers

9ef5694

Provide bool choice console functionality in central place

1f90e9c

Use util funcs for simple code dedup in modules

494153a

luis261 added 2 commits May 6, 2024 16:55

Remove unused import

80b2d83

Fix err util: use rich.print instead of builtins.print

7319b9f

Add new utility fn for rich table inits

b37141b

Rename new bool str styling func

aaaacea

luis261 added 3 commits May 9, 2024 12:40

Add additional, high level table util

8018678

Add small util generator fn for sorting out blanks

3196102

Apply bunched up refactoring changes to linux analyzer

cb98265

luis261 added 3 commits May 9, 2024 22:15

Add docstring

ba87357

Fix new yara util statmeth

f80584a

Fix missing cast introduced by refactoring

a304f91

Compensate for nonlocal /usr/bin/ entrypoint by adding project file…

ba52c65

…s directory to path

luis261 changed the title ~~WIP: refactoring~~ Refactoring (step up module architecture, establish central location for common functionality, cleanup for linux analyzer) May 25, 2024

luis261 changed the title ~~Refactoring (step up module architecture, establish central location for common functionality, cleanup for linux analyzer)~~ Refactoring (step up module architecture, establish central location for common functionality, cleanup linux analyzer) May 25, 2024

Fix raw invocations: wrong condition; don't assume argv idx 2 exists

780bbe3

This comment was marked as outdated.

Sign in to view

Introduce new util for safe argv indexing and propagate to win and li…

d52fa20

…n analyzers

This comment was marked as outdated.

Sign in to view

Refactoring (step up module architecture, establish central location for common functionality, cleanup linux analyzer) #58

Are you sure you want to change the base?

Refactoring (step up module architecture, establish central location for common functionality, cleanup linux analyzer) #58

Conversation

luis261 commented May 4, 2024 • edited

Module excecution improvements (for now only applied to the linux analyzer, other modules will be modified in dedicated PRs):

Conversion of Modules directory into a Python package

Extraction of utility function previously strictly local to main module into separate module

luis261 commented May 4, 2024

CYB3RMX commented May 4, 2024

luis261 commented May 4, 2024

CYB3RMX commented May 6, 2024

luis261 commented May 6, 2024

luis261 commented May 8, 2024

CYB3RMX commented May 9, 2024

luis261 commented May 9, 2024

CYB3RMX commented May 9, 2024

luis261 commented May 9, 2024

CYB3RMX commented May 9, 2024

luis261 commented May 9, 2024

CYB3RMX commented May 9, 2024

luis261 commented May 9, 2024

luis261 commented May 9, 2024

luis261 commented May 9, 2024

CYB3RMX commented May 24, 2024

luis261 commented May 24, 2024

luis261 commented May 24, 2024 • edited

luis261 commented May 24, 2024 • edited

CYB3RMX commented May 24, 2024

luis261 commented May 24, 2024

luis261 commented May 25, 2024 • edited

luis261 commented May 25, 2024

CYB3RMX commented May 25, 2024

luis261 commented May 25, 2024 • edited

CYB3RMX commented May 25, 2024

luis261 commented May 25, 2024

CYB3RMX commented May 25, 2024

luis261 commented May 25, 2024

CYB3RMX commented May 25, 2024

luis261 commented May 25, 2024

CYB3RMX commented May 27, 2024 • edited

luis261 commented May 27, 2024 • edited

luis261 commented May 27, 2024 • edited

This comment was marked as outdated.

This comment was marked as outdated.

This comment was marked as outdated.

luis261 commented May 27, 2024 • edited

CYB3RMX commented May 27, 2024

luis261 commented May 27, 2024

luis261 commented May 4, 2024 •

edited

luis261 commented May 24, 2024 •

edited

luis261 commented May 24, 2024 •

edited

luis261 commented May 25, 2024 •

edited

luis261 commented May 25, 2024 •

edited

CYB3RMX commented May 27, 2024 •

edited

luis261 commented May 27, 2024 •

edited

luis261 commented May 27, 2024 •

edited

luis261 commented May 27, 2024 •

edited