> Rule #13: AI-Assisted Coding is for the Weak
Me reading this article: yup, yup, yup, yup, wait a minute!
This part feels a bit snuck in, in a "leading and pacing" kind of way: all the other points are long-established no-brainers, but this one is still controversial and, in the general form it's presented here, I'd say wrong.
The author is still right that it's wrong to categorically dismiss AI tools when coding. But you'd have to apply a lot more caution to this point than to the others.
I hope sneaking this in wasn't the real motivation of the article.
Joke's on you, the article is AI-written
busted
Classic listicle trick: All the obvious ones and one random controversial hot take to get the engagement going.
Alternative for rule 1 - give descriptive names to your variables, but then just reuse them throughout the function for all kinds of values and purposes. The longer the function the better.
Example
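A hypothetical sketch of the pattern, with the function and variable names invented purely for illustration: one descriptive-sounding name is rebound to a different kind of value on every line.

```python
def summarize_orders(data):
    # `data` starts out as a list of order dicts...
    data = [order["total"] for order in data]  # ...now it is a list of numbers
    data = sum(data)                           # ...now it is a single number
    data = f"Total: {data:.2f}"                # ...and now it is a string
    return data


print(summarize_orders([{"total": 1.5}, {"total": 2.5}]))
```

Anyone reading a line in the middle has to replay everything above it to know what `data` currently holds.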
Really fun to debug.
I once was in the position of becoming PR approver[1] for a team of outsourced Python programmers who were under some pretty extreme deadline pressure. Anyway, one day a PR comes in and I can’t help but notice it was doing a string eval.
Weird. You almost never need to do string eval in Python, and whenever you think you need eval there is a better and safer way to achieve the same result.
Also, I was bending my brain but couldn’t really figure out what this eval was for until I wrote out some scratch code myself.
Turns out these 5 lines or so were constructing a string to do a dict lookup and then evalling that. So say you have a dict d = {'foo': 'bar'} and a variable i = 'foo' and want to look up d[i]; instead of just doing that, it was doing something like
eval('d['+i+']')
Just no.
So I rejected the change and they came back with “but we’ve always done it that way”. I grep the codebase and yes. There were 200+ uses of eval, all of which were constructing a string to look something up in a dictionary and then evalling the result. Some person who clearly didn’t program in Python had found this twisted way to look things up in a dictionary, and then this team had dutifully copied and pasted it throughout the codebase.
[1] ie I wasn’t there from the start of the project
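For contrast, a minimal sketch of the two approaches side by side. The function names are made up for illustration, and the eval version is only a guess at what the copied snippet roughly did.

```python
def lookup_via_eval(d, i):
    # Roughly the anti-pattern: build source code as a string, then evaluate it.
    # repr() is needed just to turn the key back into a valid literal.
    return eval("d[" + repr(i) + "]")


def lookup_directly(d, i, default=None):
    # The ordinary way: index the dict, or use .get() when the key may be absent.
    return d.get(i, default)


d = {"foo": "bar"}
print(lookup_via_eval(d, "foo"))             # same result, far more risk
print(lookup_directly(d, "foo"))
print(lookup_directly(d, "nope", "fallback"))
```

The direct version is shorter, has no code-injection surface, and makes the missing-key behaviour explicit.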
I saw plenty of similar things from outsourced teams.
For instance: an e-commerce API that used JSON. Not only did the spec tell them to use integer pence for prices, it explicitly called out that it MUST NOT use floating-point pounds. Sure enough, they implemented it as floating-point pounds. So we asked them to fix this.
The underlying datatype in the database was pounds in a decimal type. You would think that they would multiply this by 100 and call it a day. What they actually did was: a) render it as a string, b) strip the period character, and c) parse the resulting string as an integer. They didn’t test this properly before deploying, which resulted in us charging the correct price for things that cost £xxx.x5 but undercharging by a factor of ten for things that cost £xxx.x0 and undercharging by a factor of a hundred for things that cost £xxx.00.
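A sketch of how that failure mode can arise, written in Python even though the story doesn't say what language the team used. It assumes the string rendering dropped trailing zeros, which is what the undercharging pattern suggests; both function names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def buggy_pounds_to_pence(pounds: Decimal) -> int:
    # Assumption: the renderer drops trailing zeros, so 12.50 becomes "12.5".
    rendered = format(pounds.normalize(), "f")
    # Stripping the period then yields 125 instead of 1250.
    return int(rendered.replace(".", ""))


def pounds_to_pence(pounds: Decimal) -> int:
    # Multiply by 100 in decimal arithmetic and round to a whole number of pence.
    return int((pounds * 100).to_integral_value(rounding=ROUND_HALF_UP))


for price in ("12.35", "12.50", "12.00"):
    p = Decimal(price)
    print(price, buggy_pounds_to_pence(p), pounds_to_pence(p))
```

Prices ending in a non-zero digit come out right, which is exactly why a quick spot check might have missed it.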
> So I rejected the change and they came back with “but we’ve always done it that way”. I grep the codebase and yes. There were about 200+ uses of eval
That's code review at its worst! Happened to me many times.
I came across something quite similar. Using eval to convert a string to a dict. The string potentially included user input.
I wonder whether my dudes cut and pasted from the same cursed stack overflow snippet as your dudes had.
The strings here included user input too. Worse still, the company was offering a B2B service, so the strings didn’t just come from user input by an employee of the company; they came from arbitrary customers of the customers of the company.
In my case the data was visible in the URL - they had chosen not to store session-specific data in the DB or a cookie or anything sane like that, but to pass it to the page in the URL path by converting a dict to a string.
Git blame shows the same thing done in two different places and the line edited by at least two different people.
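If the goal really is to turn a string back into a dict, here is a sketch of two standard-library options that don't execute arbitrary code. The helper names are made up for illustration; `ast.literal_eval` only accepts Python literals and `json.loads` only accepts JSON.

```python
import ast
import json


def parse_dict_literal(text: str) -> dict:
    # literal_eval accepts literals only; calls and names raise ValueError.
    value = ast.literal_eval(text)
    if not isinstance(value, dict):
        raise ValueError("expected a dict literal")
    return value


def parse_json_object(text: str) -> dict:
    # Preferred when you control the format: JSON is data, never code.
    value = json.loads(text)
    if not isinstance(value, dict):
        raise ValueError("expected a JSON object")
    return value


print(parse_dict_literal("{'foo': 'bar'}"))
print(parse_json_object('{"foo": "bar"}'))
```

Neither makes user-controlled session data in a URL a good idea, and `ast.literal_eval` can still be made to choke on very large or deeply nested input, so it is not a complete defence against hostile strings.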
Yes! India? There is a very big follow-the-leader culture.
Rhymes with apocryphal monkey-ladder story.
Double pro tip for naming identifiers: overwrite built-ins; dir, list, len, file, those are all beautiful identifiers, and debugging the resulting bugs is twice the fun.
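A small sketch of what that "fun" looks like; the function names are invented for the example. Once `list` is rebound in a scope, the built-in is unreachable there by that name.

```python
def shadowing_demo():
    list = [3, 1, 2]  # shadows the built-in `list` inside this function
    try:
        # Looks like a constructor call, but `list` is now a list object.
        return list("abc")
    except TypeError as exc:
        return f"TypeError: {exc}"


def without_shadowing():
    items = [3, 1, 2]  # a non-colliding name keeps the built-in usable
    return list("abc"), sorted(items)


print(shadowing_demo())
print(without_shadowing())
```

The error surfaces far from the assignment that caused it, which is what makes it tedious to track down in a long function.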
Well hey there ChatGPT! I'd recognise your writing style anywhere: it's like a bad metaphor where you end up saying "what?".
Disappointed that my favorite pythonic pattern of "here's a dict with unknown contents, do the right thing with it" is not listed.
Hey now, at least the dictionary has keys that _could_ hint at the contents (or be completely misleading). What about the tuple with just positions?
image.size
ndarray.shape
Are the image sizes (width, height) or (height, width)?
Trick question of course, it's (height, width, channels) for numpy. numpy is fairly well known though and sort of gets away with it, but when your never-seen-before internal company library starts doing this, well...
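One low-tech way to avoid the guessing game in your own code, sketched with the standard library's `typing.NamedTuple`. The `ImageSize` type and `area` helper are hypothetical and not taken from any particular imaging library.

```python
from typing import NamedTuple


class ImageSize(NamedTuple):
    width: int
    height: int


def area(size: ImageSize) -> int:
    # Fields are read by name, so the order question never comes up.
    return size.width * size.height


size = ImageSize(width=640, height=480)
print(size.width, size.height, area(size))

# It is still a plain tuple, so existing positional code keeps working.
w, h = size
```

Because it remains a tuple underneath, this can be introduced gradually without breaking callers that unpack by position.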
Serious question, but what is the best practice for keeping n-dimensional arrays organized/labeled? Pandas? Xarray? Converting everything to netcdf before using it?
It's not just Python. You see this a lot in JavaScript and Groovy too.
Really, I think what makes languages with manual memory management and limited built-in types like C nice is that they force people to really think about the types and interfaces they're writing instead of just hacking out unmaintainable code at 20 lines an hour.
Seen (and fixed): A function returns some “struct” (as a dict), with a couple of fields, mostly stringly typed, and one field for accumulating error messages called “errors”. The dict was created as a defaultdict(list) (probably) to make it easier to append messages.
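A sketch of why that `defaultdict(list)` "struct" bites, alongside one possible replacement. The field names other than "errors" are invented for illustration, and the dataclass is just one reasonable fix, not necessarily the one that was applied.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# The pitfall: merely reading a misspelled field silently creates it.
result = defaultdict(list)
result["errors"].append("bad input")
typo_value = result["erors"]  # no KeyError, just a new empty list
print(sorted(result))         # both "erors" and "errors" are now keys


@dataclass
class ParseResult:
    # Hypothetical fields; only `errors` comes from the description above.
    name: str = ""
    errors: list = field(default_factory=list)


parsed = ParseResult(name="example")
parsed.errors.append("bad input")
try:
    parsed.erors  # a typo now fails loudly
except AttributeError as exc:
    print("AttributeError:", exc)
```

With explicit fields, both a type checker and the runtime catch the typo instead of quietly returning an empty list.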
haha, i should add this.
What do you mean, "unknown contents"?
Just trace the dict back to its origin, through all the places it can be modified, and you know what is (or could be) inside...
/s
> abandoned packages with no recent commits
So, stable packages?
Flip a coin, you might get lucky, and that’s how all great applications are built.
I am teaching python to a group of beginner programmers. This will be shared!
Caveat #13
oh but there's so much worse xD... but i guess that goes for code in general. fun read tho, and with reverse psychology u can maybe learn something as a novice snake wrangler xD
thanks :)