Microsoft Excel blamed for gene name errors in the scientific literature



Microsoft Excel is the world’s most popular spreadsheet software, and it is used across industries. A new study published in Genome Biology claims that Excel’s automatic formatting has introduced errors into approximately one-fifth of genomics journal papers that include supplementary Excel gene lists. When used with default settings, Excel converts some gene names to dates and floating-point numbers.

For example, gene symbols such as SEPT2 (Septin 2) and MARCH1 [Membrane-Associated Ring Finger (C3HC4) 1, E3 Ubiquitin Protein Ligase] are converted by default to ‘2-Sep’ and ‘1-Mar’, respectively. Furthermore, RIKEN identifiers have been reported to be automatically converted to floating-point numbers (for example, from accession ‘2310009E13’ to ‘2.31E+13’).
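As a rough illustration of how such conversions might be spotted in an exported gene list, here is a minimal Python sketch. The function name `find_suspect_symbols`, the two regular expressions, and the sample column are assumptions made for this example and are not taken from the study; the patterns only cover the date-like and scientific-notation forms quoted above.

```python
import re

# Assumed patterns: day-month text such as "2-Sep" and
# scientific notation such as "2.31E+13".
MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"
DATE_LIKE = re.compile(rf"^\d{{1,2}}-({MONTHS})$", re.IGNORECASE)
SCI_LIKE = re.compile(r"^\d+(\.\d+)?E\+?\d+$", re.IGNORECASE)


def find_suspect_symbols(values):
    """Return entries that look like converted dates or floats."""
    suspects = []
    for value in values:
        text = value.strip()
        if DATE_LIKE.match(text) or SCI_LIKE.match(text):
            suspects.append(value)
    return suspects


# Hypothetical gene-symbol column as it might appear after a round trip through Excel.
column = ["TP53", "2-Sep", "1-Mar", "BRCA1", "2.31E+13"]
print(find_suspect_symbols(column))  # ['2-Sep', '1-Mar', '2.31E+13']
```

A check like this only flags values that have already been altered; it cannot recover the original symbol, since more than one gene name can map to the same date.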

This is not a newly discovered issue. The problem of Excel inadvertently converting gene symbols to dates and floating-point numbers was originally described in 2004. Because most everyday users expect Excel to convert an entry such as SEP2 to 2-Sep, Microsoft has decided not to change this behaviour. The conversion is problematic for gene symbols, however, because supplementary files are an important resource in the genomics community and are frequently reused. The study screened 35,175 supplementary Excel files and found 7,467 gene lists attached to 3,597 published papers. The authors confirmed gene name errors in 987 supplementary files from 704 published articles.
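The "approximately one-fifth" figure can be checked against these counts. The short calculation below is only a sanity check on the numbers quoted in this article, with variable names chosen for illustration.

```python
# Counts as quoted in the article.
papers_with_gene_lists = 3597
papers_with_errors = 704

# Share of papers with gene lists that contain confirmed errors.
share = papers_with_errors / papers_with_gene_lists
print(f"{share:.1%}")  # 19.6%
```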

There appears to be no direct way to permanently deactivate the automatic formatting of dates in Excel, and the issue also occurs in other popular spreadsheet programs such as LibreOffice Calc and Apache OpenOffice Calc. The study was conducted to raise awareness of the problem among the genomics academic community.

Read the full report here.
