Another reason not to use python?

mark_j · Jul 31, 2021

This could possibly go in the scripting topic.

According to these folks (jfrog.com) and to quote:

As part of an ongoing effort by the JFrog security research team (formerly Vdoo) to automatically
identify malicious packages, we are now reporting several Python packages hosted on PyPI as
malicious. We have alerted PyPI about the existence of the malicious packages which promptly
removed them.

Python developers are being targeted with malicious packages on PyPI

JFrog finds a new supply chain attack targeting python developers using the PyPI repository

jfrog.com

This is an on-going issue with repositories (and has been since Perl years ago) and languages that are based on this model.

Trust is something we often give out when probably we shouldn't.

For developers to assume the code they pull in is safe is just reckless. End-users, unfortunately, are at the mercy of packagers and port maintainers.

I did manage to find a reference to the affected packages, but I am not sure this is a definitive list:

Software downloaded 30,000 times from PyPI ransacked developers’ machines

Expect to see more of these "Frankenstein" malware packages, researchers warn.

arstechnica.com

a6h · Jul 31, 2021

mark_j said:
Trust is something we often give out when probably we shouldn't.

This is how they are growing their gigantic libraries, i.e. the main reason behind its popularity. If you want a Cobol (Python) job, you have to learn Cobol (Python). I understand that. There's nothing wrong with learning multiple languages -- use it; dump it. But there's a Python-fad too; mostly around ML and hacking-stuff!

A retro-way of doing things:
1. C and/or Shell (sh, sed, awk).
2. Too much work! Use Perl.
3. Well-paid jobs? Learn the new language, even if it's FoxPro 2.6.

drhowarddrfine · Jul 31, 2021

vigole said:
there's a Python-fad too;

And now Rust, too.

Tieks · Jul 31, 2021

mark_j said:
For developers to assume the code they pull in is safe is just reckless.

Right you are. I look at the code before ever using that stuff, and the examples from the first link are obviously wrong, imo. Something like joy = '\x72\x6f\x74\x31\x33': why not simply joy = 'rot13'?
I'm definitely off with eval(compile(base64.b64decode(eval(...)))). It's almost beautiful.
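
To make that concrete: the hex-escaped literal is just an ordinary string, and a few lines of Python can flag the eval/compile/base64 pattern without ever running the code. This is only a minimal sketch; the helper name `find_suspicious_calls` and the list of "suspicious" names are assumptions made for illustration, not how any real scanner works, and anything slightly more inventive will slip past it.

```python
import ast

# The hex-escaped literal from the example is an ordinary string.
joy = '\x72\x6f\x74\x31\x33'
print(joy)  # rot13

# Hypothetical, deliberately crude heuristic: call names that rarely
# belong in an honest setup.py or package __init__.py.
SUSPICIOUS = {"eval", "exec", "compile", "b64decode"}


def find_suspicious_calls(source):
    """Return sorted (line, name) pairs for calls to suspicious functions.

    The source is only parsed, never executed.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Covers both plain calls like eval(...) and attribute calls
        # like base64.b64decode(...).
        if isinstance(func, ast.Name):
            name = func.id
        elif isinstance(func, ast.Attribute):
            name = func.attr
        else:
            continue
        if name in SUSPICIOUS:
            hits.append((node.lineno, name))
    return sorted(hits)


sample = "import base64\neval(compile(base64.b64decode(b'cGFzcw=='), '<s>', 'exec'))\n"
print(find_suspicious_calls(sample))
```

A hit is a reason to read the file by hand, not proof of malice; plenty of legitimate code calls `compile` or `b64decode`.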

hardworkingnewbie · Jul 31, 2021

Not the first time such a thing happened to a well enough established programming language with a central code/library repository, though.

But that's not a valid reason to not use that language. Just a sign that you should pay more attention to whom you trust, as always.

For me though Python is harmless, because the cake of stupid lazy ass programmers lies clearly within Node.js and their npm repository. These people are too dumb to program their own basic string manipulations. And that's not an exaggeration.

A developer named Azer Koçulu, under legal pressure, pulled around 250 of his modules from npm in 2016, including something called left-pad. One of his modules was named kik, the company behind the Kik messenger wanted that name, and after npm sided with the company he pulled all of his modules.

Left-pad was just this snippet of code, nothing more:

Code:

module.exports = leftpad;

function leftpad (str, len, ch) {
  str = String(str);

  var i = -1;

  if (!ch && ch !== 0) ch = ' ';

  len = len - str.length;

  while (++i < len) {
    str = ch + str;
  }

  return str;
}

So all it does is pad out the left-hand side of strings with zeroes or spaces. No rocket science. This module back then had almost 2.5 million downloads per week. And really something at beginner programming level as well.
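
For comparison, many standard libraries already cover this. A short sketch in Python (used here only for illustration, since the module above is JavaScript), where the built-in `str.rjust` does the same job with no dependency. The wrapper name `left_pad` is made up for the example.

```python
def left_pad(text, width, fill=" "):
    """Pad text on the left to a total of `width` characters.

    Unlike the npm module, str.rjust requires `fill` to be exactly
    one character.
    """
    return str(text).rjust(width, fill)


print(left_pad(42, 5, "0"))   # 00042
print(left_pad("abcdef", 3))  # abcdef
```

Strings that are already long enough come back unchanged, which matches the loop in the original module.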

So when he pulled it there was a nasty fallout, because suddenly lots of stuff around the world broke when people tried to install all their fancy npm stuff: that dependency was now unresolvable. The fallout was so enormous that the maintainers of npm restored a version of it quite quickly.

Here's a nice writeup about the incident: https://www.davidhaney.io/npm-left-pad-have-we-forgotten-how-to-program/

Or that case when one guy named Dominic Tarr, the author of event-stream, didn't want to support it any longer and handed over the maintainership to another guy. event-stream also has around 2 million downloads per week. And the new maintainer used it to spread malware. Nice writeup here.

NPM also back then introduced other ~~innovative~~ hideous stuff in its update process, like - uhm - ads on the console in its command line interface. Well, they quickly shelved that.

And then the inevitable happened: npm was acquired by GitHub, i.e. by Microsoft. What could possibly go wrong...

mark_j · Jul 31, 2021

Yes, don't get me wrong, every language, where you pull in code from elsewhere to fulfill X function without vetting the code is real bad.

However, these languages like python that are so-called multi-paradigm are designed so users/developers can just pull in what they want without any thought. What could possibly go wrong? You've given us another very good example.

It's all fun and games until you're hacked.

Alain De Vos · Aug 1, 2021

Can't the same thing happen using perl5 or ruby ?

Tieks · Aug 1, 2021

Alain De Vos said:
Can't the same thing happen using perl5 or ruby ?

I've seen similar things in Perl .pm modules. You just have to check when you pick up something from a repository somewhere.
Don't get me wrong, there are many well-programmed and useful modules out there. It's just that not all of them are.

Crivens · Aug 1, 2021

mark_j said:
Yes, don't get me wrong, every ~~language~~ developer, where you pull in code from elsewhere to fulfill X function without vetting the code is real bad.

FTFY

kpedersen · Aug 1, 2021

Of course it is the developer's fault. Though I suppose languages with their own package manager kind of encourage this kind of careless behavior. NPM, Crates.io, CPAN (incl LaTeX), Gems, etc all breed terrible behavior. It is so rare that a program pulls in only one or two dependencies like a typical C or C++ program. They *always* without fail seem to drag in dozens of little bits of cruft.

This and bindings. The world is built on C, so I suppose other languages needing bindings to wrap the native libraries are at a disadvantage when it comes to dependencies. I.e if Rust had a tiny C compiler added to it so it could consume C libraries directly, it would be a much stronger contender to C++.

Alain De Vos · Aug 1, 2021

Sometimes I code in D-lang. And when I need a binding for, let's say, PostgreSQL, it pulls in numerous libraries which are not PostgreSQL related.
They all fail at the same place.

kpedersen · Aug 1, 2021

Alain De Vos said:
Sometimes I code in D-lang. And when I need a binding for, let's say, PostgreSQL, it pulls in numerous libraries which are not PostgreSQL related.

Can D-lang not use C includes directly? Just like with C++, you lose a bit of safety or an idiomatic API but you do benefit from simplifying the solution as a whole.

Alain De Vos · Aug 1, 2021

You can call with extern(C), and then it follows the C calling convention. This works fine for simple libraries.
It fails when you have complex include files with lots of macros and defines, or when C++ is used.
I should try it sometime with, for instance, libpq.

astyle · Aug 1, 2021

A good expression for all that - the devil is in the details. When a program gets big enough to do anything of interest - that's when it becomes easier to hide malicious intent among mistakes. And then you wonder why we have tight control over who actually gets to be a committer - just remember the mess that University of Minnesota got itself into, with a supposedly malicious commit that exposed holes in how kernel patches get submitted.

mark_j · Aug 1, 2021

Alain De Vos said:
Can't the same thing happen using perl5 or ruby ?

Absolutely, that's why I mentioned it. When I was using perl (a lot) a long time ago this always worried me. You pull in modules and then run them as root or other high privilege and don't give it a second thought.

Obviously the entire open source ecosystem is a potential victim to this issue because we share but often don't care about the code we incorporate. This seems especially true with junk like javascript.

hardworkingnewbie · Aug 2, 2021

mark_j said:
Absolutely, that's why I mentioned it. When I was using perl (a lot) a long time ago this always worried me. You pull in modules and then run them as root or other high privilege and don't give it a second thought.

Obviously the entire open source ecosystem is a potential victim to this issue because we share but often don't care about the code we incorporate. This seems especially true with junk like javascript.

Well, you don't have to use JavaScript for such examples. Any programming language fits that bill if it has a library so well established for some task that it is practically everywhere, yet few people look at its internal state and fewer still have the knowledge to judge whether the algorithms are implemented correctly.

For me the prime example is OpenSSL when the Heartbleed exploit was discovered. This led to the LibreSSL fork, and boy oh boy, what they found in the source code was disgusting.

Crivens · Aug 2, 2021

The problem is worst when the code loads its dependencies on its own. You have no way to validate a build. Now for that safety critical stuff, like in medicine...
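
One partial mitigation is to pin a digest for every artifact ahead of time and refuse to build when a download does not match, which is roughly what the distinfo files in FreeBSD ports do. A minimal sketch in Python, where the function names and the stand-in file are assumptions for illustration; it does nothing for code that fetches more code at runtime unless the fetcher itself performs the check.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_digest(path, pinned_hex):
    """True only if the file's digest equals the digest pinned earlier."""
    return hmac.compare_digest(sha256_of(path), pinned_hex.lower())


# Demo with a stand-in file; a real build would pin the digest of the
# actual source tarball when the dependency was reviewed.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example artifact\n")
    artifact = tmp.name

pinned = hashlib.sha256(b"example artifact\n").hexdigest()
ok = matches_pinned_digest(artifact, pinned)
tampered = matches_pinned_digest(artifact, "0" * 64)
os.remove(artifact)
print(ok, tampered)  # True False
```

A matching digest only proves the artifact is the one that was pinned; whether that artifact was trustworthy in the first place is still a matter of reading it.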

fbsd_ · Aug 2, 2021

Actually, there is no reason not to use a language as long as it is useful for the job. There can be Python packages that contain viruses, but that doesn't mean the language is dangerous to code in. With more than 137 thousand Python packages out there, you are unlikely to download one of the 4 or 5 infected ones. As far as I know, those packages worked like RAT programs: they steal browser data, Discord recovery codes, files from the filesystem, etc. The same risk exists for JS, too. As far as I know, a lot of npm packages have been found to contain malicious code.

While the world is a cruel place, there is always the instinct of self-preservation, and at the same time those who are afraid of committing crimes in real life will create viruses by writing and stealing a little code in the virtual environment.

So the solution is to prefer open-source, popular packages for projects.

astyle · Aug 2, 2021

hardworkingnewbie said:
Well, you don't have to use JavaScript for such examples. Any programming language fits that bill if it has a library so well established for some task that it is practically everywhere, yet few people look at its internal state and fewer still have the knowledge to judge whether the algorithms are implemented correctly.

For me the prime example is OpenSSL when the Heartbleed exploit was discovered. This led to the LibreSSL fork, and boy oh boy, what they found in the source code was disgusting.

Old stuff... OpenSSL has actually been patched for Heartbleed already. The Wikipedia page actually points out that Heartbleed first appeared in 2012 (when OpenSSL was on version 1.0.1). Big Tech companies have been surprisingly slow to patch it up, but by mid-2014 Heartbleed became a big enough problem to finally be acknowledged as something that requires attention and a fix. FreeBSD updated its security/openssl OpenSSL implementation to 1.0.2 back in 2015, which corresponds to 10.1-RELEASE support cycle.

hardworkingnewbie · Aug 2, 2021

That was not my point; my point was that you can include bad libraries everywhere, and even well-known packages like OpenSSL might be just that - bad, astyle.

kpedersen · Aug 2, 2021

hardworkingnewbie said:
That was not my point; my point was that you can include bad libraries everywhere, and even well-known packages like OpenSSL might be just that - bad, astyle.

At least with C or C++ you include one library, i.e OpenSSL. With many other languages you will need OpenSSL *and* countless other libraries providing the non-native binding layers, frameworks, etc.

The chance that one single library is bad is quite low. However, once a solution drags in loads of cruft, this chance rises considerably.
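
A rough back-of-the-envelope illustration of that point, assuming (unrealistically) that every dependency is independently bad with the same small probability p. The value p = 0.001 is made up for the example, not measured.

```python
def chance_any_bad(p, n):
    """Probability that at least one of n independent dependencies is bad."""
    return 1 - (1 - p) ** n


# Compare a single library with increasingly deep dependency trees.
for n in (1, 10, 100, 500):
    print(n, round(chance_any_bad(0.001, n), 4))
```

With these made-up numbers the risk goes from about 0.1% for a single library to just under 10% for a hundred dependencies and roughly 39% for five hundred.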

astyle · Aug 2, 2021

Crivens said:
The problem is worst when the code loads its dependencies on its own. You have no way to validate a build. Now for that safety critical stuff, like in medicine...

Crivens: Sorry, but I'd like to challenge you for some concrete examples here. Are you talking about runtime deps? The FreeBSD pkg and ports systems are actually pretty good about limiting that to registered deps.

FWIW, when I compile lang/rust, I notice that the source tarball isn't large, but make pulls in a truckload of deps from crates.io... But I honestly look at that as 'reinventing the wheel'. For comparison, devel/llvm bundles nearly everything it needs into one fairly large source tarball, that takes forever to compile, but doesn't need to pull in a truckload of deps that are frankly a re-implementation of standard UNIX utilities.

mark_j · Aug 2, 2021

hardworkingnewbie said:
That was not my point; my point was that you can include bad libraries everywhere, and even well-known packages like OpenSSL might be just that - bad, astyle.

I think the differentiation between poor code and nefarious code needs to be made, as well as the trust placed in repositories such as the one at issue here.

You will always have bugs, even in God's own language, Rust, but code designed to exploit your system is a very different beast.

The true problem is trusting these repositories, from Perl to Python but also for the seemingly blind pull-in of code to fulfill a purpose; I'm looking at you javascript programmers (just look at an average web page and the s%#t it loads from various javascript sites. It's a disaster waiting to happen).

Even in ports, many grab stuff from some individual's web site to build from. Has that been hacked? Even SourceForge was hacked a few years back, but they said don't worry.

astyle · Aug 2, 2021

mark_j said:
I think the differentiation between poor code and nefarious code needs to be made

Think that's easy? Stuxnet was nefarious, but it took a team of Kaspersky-funded researchers a few days to even figure out that it was targeting nuclear facilities in Iran. And a PBS Nova documentary on the topic pointed out that the compiled code, when reverse-engineered, was discovered to have a goal of making some centrifuges go faster. That's the same stuff that makes fans on a graphics card go faster or slower. Even Stuxnet was frankly a gamble. It could have hacked the smart blender in my kitchen and made strawberry smoothies for me.

mark_j · Aug 2, 2021

Source code.
