33 points

This is the best summary I could come up with:


In 2020, scientists decided just to rework the alphanumeric symbols they used to represent genes rather than try to deal with an Excel feature that was interpreting their names as dates and (un)helpfully reformatting them automatically.

Yesterday, a member of the Excel team posted that the company is rolling out an update on Windows and macOS to fix that.

Excel’s automatic conversions are intended to make it easier and faster to input certain types of commonly entered data — numbers and dates, for instance.

But for scientists using quick shorthand to make things legible, it could ruin published, peer-reviewed data, as a 2016 study found.

Microsoft detailed the update in a blog post this week, adding a checkbox labeled “Convert continuous letters and numbers to a date.” You can probably guess what that toggles.

The update builds on the Automatic Data Conversions settings the company added last year, which included the option for Excel to warn you when it’s about to get extra helpful and let you load your file without automatic conversion so you can ensure nothing will be screwed up by it.


The original article contains 225 words, the summary contains 184 words. Saved 18%. I’m a bot and I’m open source!

-42 points

Why are scientists using a paid service such as Excel anyway? Shouldn’t they be using something like Libre Open Office?

42 points

Many scientists are based out of corporations or universities who contract with Microsoft, so Excel would be the default solution for working with spreadsheets.

Also, when it comes to “office” applications, there is no real substitute for Excel. For word processing, presentations, email, and notes, there are many open and closed source alternatives that do the job as well as, if not better than, the MS Office applications. Excel, however, is the exception.

LibreOffice Calc, G-Sheets, Apple’s Numbers, or the myriad of competitor office solutions have never matched Excel for in-depth analyses or overall function. For just basic features, one could limp by with most alternatives, but doing real analytical work within spreadsheets requires Excel.

13 points

“Real analytical work” shouldn’t be done in spreadsheets at all. You should use a database. Basic spreadsheet features are all you should ever use spreadsheet software to do anyway.

-8 points
*

No one does real analytical work with Excel… If one is using Excel, they are doing basic analytical work that can be done by pretty much any spreadsheet software.

It is just habit. People are used to Excel, and are not competent enough to use more advanced tools to do real analytical work. And that’s fine. Being good in a lab doesn’t necessarily mean being good at data science.

-9 points
*

“Real analytical work” is the ultimate power-tools-injure scenario with Excel, and that’s why this article exists.

Programmers using actual databases and crafting custom analysis do not have this problem. There is a time and a place for Excel, and this ain’t it; leave it to secretaries and people trying to copy data into Word documents. I like a pivot table as much as the next guy, but JFC, learn to program, learn git, write in LaTeX, publish science.

2 points

Gnumeric is superior for numerical evaluation.

Also, any analysis at scale will use a proper programming language, often C or Fortran, since Excel is simply far too slow.

3 points

Why should they do that?

1 point

In science, it is important to have verifiable and replicable results. This means everything you use - from ingredients to software - should be transparent. We can’t examine Excel’s source code, so we don’t know if it is working as it claims to be. Most scientific disciplines are moving towards open source, open access etc., and you can’t use Excel in fields like physics or mathematical biology. But molecular biology is a bit of a holdout.

13 points
*

In my experience, being a scientist doesn’t mean one is the smartest guy in the room. Just that one has the passion, luck, and luxury to pursue that passion.

Many use alternatives to Excel (R, Python, Matlab, LibreOffice).

For others, installing software is challenging enough that they use whatever IT provides.

The rest don’t give a sh*it; they are too busy exploiting or being exploited. No time to think about what is better.

5 points

I’ve had the same copy of Excel since high school, and it’s done a damn fine job processing experimental data through undergrad, my PhD, and 6 years as a working researcher.

It’s also the software pretty much everyone has, so you can easily share data with collaborators and other researchers. And it has a ton of functionality so you can process and analyze data easily, and create the visuals for papers very easily.

16 points
*

In college a professor gave us some homework to be done in Excel, and as the nerd that I am, I asked if LibreOffice was OK because I use Linux and have no access to Excel. The professor was like, well, in that case everyone do the homework in R or Python. My classmates were really mad at me for that.

4 points

You are completely right, and the Open Science movement is catching on. The idea is to give everyone access to the (anonymised) data and use only tools that are freely accessible, even to scientists from developing countries without Microsoft licenses, so that they too can rerun your analyses and verify your results. You shouldn’t be getting downvoted.

51 points

Thank god! You have no idea how awful this is for scientists. Need to paste some gene names down? Better hope it’s not MARCH8 (since renamed MARCHF8) or in the Septin gene family, otherwise you have to convert columns to text and then import the data. Seems like a simple fix, but many wet lab biologists are technologically challenged.
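For anyone who wants to check a gene list before it goes anywhere near a spreadsheet, here is a rough sketch in Python. The month list, the regular expression, and the name `looks_date_like` are my own illustrative assumptions, not Excel's actual parsing rules; it only flags the "month name plus day number" shape that bit the old MARCH and SEPT symbols.

```python
import re

# Month names/abbreviations a spreadsheet might read as the start of a date.
# ASSUMPTION: this list and pattern are a heuristic, not Excel's real parser.
_MONTHS = (
    "JAN|JANUARY|FEB|FEBRUARY|MAR|MARCH|APR|APRIL|MAY|JUN|JUNE|"
    "JUL|JULY|AUG|AUGUST|SEP|SEPT|SEPTEMBER|OCT|OCTOBER|NOV|NOVEMBER|"
    "DEC|DECEMBER"
)
_DATE_LIKE = re.compile(rf"^(?:{_MONTHS})[-/ ]?(\d{{1,2}})$", re.IGNORECASE)


def looks_date_like(symbol: str) -> bool:
    """Return True if the symbol is a month name followed by a day number 1-31."""
    match = _DATE_LIKE.match(symbol.strip())
    return bool(match) and 1 <= int(match.group(1)) <= 31


if __name__ == "__main__":
    for symbol in ["MARCH8", "MARCHF8", "SEPT2", "SEPTIN2", "TP53"]:
        print(symbol, looks_date_like(symbol))
```

It is deliberately narrow: it says nothing about the other automatic conversions (long digit strings, scientific notation and so on), so treat a False as "not this particular problem" rather than "safe".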

116 points

Me before reading the article: It’s got to be dates. Excel thinks everything is a date.

Me after reading the article: Even the workaround is halfhearted. Jeebus.

6 points

Apart from actual dates.

13 points

Microsoft’s blog adds caveats, such as that Excel avoids the conversion by saving the data as text, which means the data may not work for calculations later. There’s also a known issue where you can’t disable the conversions when running macros.

178 points

How about just not auto-convert everything and keep the integrity of the data unless specifically asked to? Is that so hard?

113 points
*

Microsoft assumes their users are complete idiots, even when they (the users) are actively trying to convince them (Microsoft) otherwise. No matter how advanced the feature may be, they’ll assume you found instructions somewhere to do something entirely unrelated and they constantly have to save you from yourself. As a result you constantly have to fight the OS for access and control to get it to do what you want.
If you’re even a bit of a power user that is, of course.

But more often than not Microsoft’s assumption is probably spot on.

8 points

That assumption is perfectly good for a default. Not a mandatory feature that power users have to live with.

3 points
*

As a default, sure. Should be one that’s easily changed, though. Repeatedly fighting the machine that’s supposed to do your bidding and make your life easier gets old rather quickly. A machine you own and administrate, let’s not forget that.

24 points

Excel is inherently flawed in its design.

The thing is that Excel already has half of what would be needed to really fix this bug: a field in each cell where the original text could stay.

An Excel file is just a bunch of XML files zipped in a specific structure. You can unpack a file and look for yourself.
Each worksheet is its own file, and each cell is subdivided into the value and the formula that generated this value (or nothing, if there is no formula).
Excel could easily fix this issue by adding another possible cell attribute like “original” or “plain” that, when set, allows you to roll back any conversion.

But no, they go a half assed way as always and screw up even more.
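To make the "zipped XML" point concrete, here is a minimal Python sketch using only the standard library. It builds a tiny in-memory archive holding a single worksheet part and reads each cell's formula and value back out. The sample XML and the helper names (`build_demo_archive`, `read_cells`) are made up for illustration, and the archive is not a complete workbook that Excel would open (the content-types and relationship parts are left out); only the part path and the SpreadsheetML namespace follow the real format.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used by worksheet parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# Hypothetical sample sheet: A1 holds a plain value, B1 a formula plus its cached value.
SHEET_XML = """<?xml version="1.0" encoding="UTF-8"?>
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <sheetData>
    <row r="1">
      <c r="A1"><v>2</v></c>
      <c r="B1"><f>A1*21</f><v>42</v></c>
    </row>
  </sheetData>
</worksheet>"""


def build_demo_archive() -> bytes:
    """Zip the sample worksheet under the path a real workbook would use."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("xl/worksheets/sheet1.xml", SHEET_XML)
    return buffer.getvalue()


def read_cells(data: bytes, member: str = "xl/worksheets/sheet1.xml") -> dict:
    """Map each cell reference to a (formula, value) pair; formula is None if absent."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read(member))
    cells = {}
    for cell in root.iterfind(".//m:c", NS):
        formula = cell.findtext("m:f", default=None, namespaces=NS)
        value = cell.findtext("m:v", default=None, namespaces=NS)
        cells[cell.get("r")] = (formula, value)
    return cells


if __name__ == "__main__":
    print(read_cells(build_demo_archive()))
```

Pointing `read_cells` at the bytes of a real .xlsx should work the same way for numeric cells; text cells usually store an index into a separate shared-strings part, which this sketch does not resolve.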

15 points

In order to do that I think they would first have to ratify a standards change to the Excel format, which is open.

1 point

Uh, I mean kinda…

Excel implements two Microsoft file format standards:

  • ECMA-376
  • ISO 29500

They are not the same, and are even incompatible in parts. It is true that Microsoft tries to use ISO 29500 more, but most files (2007) are still ECMA-376.

But yes, they kinda would have to change their shitty, ISO-incompatible ISO “standard” to fix this issue this way.

Or use the formula field, idk. 😅

2 points

Excel is never ever going to break backwards compatibility. In fact, quite a few “features” in Excel are just there to stay bug-for-bug compatible with existing systems.

Example: Excel stores dates internally as a float, called the serial date; you can see it by formatting a date cell as a number, and DATEVALUE returns it for a date written as text. It is supposed to be the number of days since 1 January 1900. However, since early Excel versions had to be compatible with Lotus 1-2-3, Excel also had to reproduce a bug in Lotus 1-2-3, which erroneously treated 1900 as a leap year. In addition, the indexing is off by one. So the effective zero epoch of an Excel serial date is 30 December 1899 for all dates from 1 March 1900 onward.
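The arithmetic is easier to see in code. This is a small Python sketch, assuming the default 1900 date system and whole-day serials only; the function name `excel_serial_to_date` is mine, not anything from Excel.

```python
from datetime import date, timedelta


def excel_serial_to_date(serial: int) -> date:
    """Convert a whole-day Excel serial (1900 date system) to a calendar date."""
    if serial < 1:
        raise ValueError("serials below 1 are not handled in this sketch")
    if serial == 60:
        # Serial 60 is the fictitious 29 February 1900 inherited from Lotus 1-2-3.
        raise ValueError("serial 60 maps to 29 February 1900, which never existed")
    # After the phantom leap day every serial is one day "too large",
    # so the effective epoch shifts from 31 Dec 1899 to 30 Dec 1899.
    epoch = date(1899, 12, 30) if serial > 60 else date(1899, 12, 31)
    return epoch + timedelta(days=serial)


if __name__ == "__main__":
    for serial in (1, 59, 61, 36526):
        print(serial, excel_serial_to_date(serial))
```

Serial 1 comes out as 1 January 1900, serial 61 as 1 March 1900, and 36526 as 1 January 2000, which matches the 30 December 1899 epoch described above for everything after the phantom leap day.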

16 points
*

It’s too late though, scientists already had to rename the genes. Although of course there are other things that can trigger it, not just in science.
