Microsoft Excel is an incredibly powerful program that’s just as essential in a laboratory as it is in the average office. However, scientists have had just about enough of Excel renaming genes as if they were dates. It didn’t seem likely anyone would convince Microsoft to change the way Excel works, so scientists got together recently and renamed the genes.
Anyone who has spent time tinkering with data in Excel knows the pain of entering data by hand only to find out that Excel has formatted it in a way that makes it less useful. Scientists regularly use Excel to track work and do basic analysis on data before exporting it to more sophisticated tools. There’s no “DNA” formatting option, though, and that’s caused frustration among researchers.
Genes are simply sequences of nucleic acid (DNA in humans) that code for a protein. For example, the GTPase protein known as Septin-1 is coded by a gene called SEPT1. You can probably see where this is going. If you enter “SEPT1” in Excel, it will immediately change the cell to read “1-Sep” because it thinks you’re trying to enter a date. To make matters worse, there is no way to disable this automatic reformatting. You have to change the cell formatting in each spreadsheet manually. Failing to do so can lead to corrupted data and wasted time. Not everyone is an expert in Excel, so mistakes are common. (Note: Typing an apostrophe (') in front of an entry in Excel tells the cell to treat it as text, but there is no way to set this as the default.)
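To make the problem concrete, here is a minimal Python sketch that flags gene symbols likely to be read as dates. It is only a rough heuristic: the month list, the `DATE_LIKE` pattern, and the function name `looks_like_date` are assumptions made for illustration and do not reproduce Excel's actual parsing rules.

```python
import re

# Month names and abbreviations a spreadsheet might read as the start of a date.
# This list is an assumption for illustration, not Excel's real rule set.
MONTH_PREFIXES = (
    "JAN", "FEB", "MAR", "MARCH", "APR", "APRIL", "MAY", "JUN", "JUNE",
    "JUL", "JULY", "AUG", "SEP", "SEPT", "OCT", "NOV", "DEC",
)

# A month prefix, an optional hyphen, then one or two digits and nothing else.
DATE_LIKE = re.compile(
    r"^(?:" + "|".join(MONTH_PREFIXES) + r")-?(\d{1,2})$",
    re.IGNORECASE,
)


def looks_like_date(symbol: str) -> bool:
    """Return True if the symbol resembles a month followed by a day number."""
    match = DATE_LIKE.match(symbol.strip())
    if match is None:
        return False
    day = int(match.group(1))
    return 1 <= day <= 31


if __name__ == "__main__":
    for symbol in ["SEPT1", "MARCH1", "SEPTIN1", "MARCHF1"]:
        print(symbol, looks_like_date(symbol))
```

Under this heuristic the old names SEPT1 and MARCH1 are flagged, while the replacement names SEPTIN1 and MARCHF1 are not, because letters now sit between the month-like prefix and the digit.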
The scientific body that controls gene naming has now stepped in to set things right. The HUGO Gene Nomenclature Committee (HGNC) has thus far renamed 27 genes to ensure Excel doesn’t butcher their names. For example, the gene MARCH1 codes for a protein called Membrane Associated Ring-CH-Type Finger 1. HGNC has renamed that gene to MARCHF1 to avoid confusion. SEPT1 is now SEPTIN1. The HGNC will keep a record of the changes so there’s no confusion in the future when researchers read materials with the old names.
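A record of old and new names can be applied as a simple lookup when reading older material. The sketch below is hypothetical: the table `RENAMED_SYMBOLS` holds only the two renames mentioned above, and `update_symbol` is an illustrative helper, not an HGNC tool.

```python
# Old-to-new symbol pairs, limited to the two examples given in this article.
RENAMED_SYMBOLS = {
    "MARCH1": "MARCHF1",
    "SEPT1": "SEPTIN1",
}


def update_symbol(symbol: str) -> str:
    """Return the current symbol for a known old name, else the input unchanged."""
    return RENAMED_SYMBOLS.get(symbol.strip().upper(), symbol)


if __name__ == "__main__":
    for old in ["SEPT1", "march1", "SHH"]:
        print(old, "->", update_symbol(old))
```

Symbols that are not in the table, such as SHH here, pass through untouched, so the lookup is safe to run over a mixed list of old and current names.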
The HGNC has even published guidelines to help scientists name (and rename) genes, which makes things more efficient but also a bit less fun: there probably won’t be any new “sonic hedgehog” genes. The guidelines cover more than Excel screw-ups. They also recommend avoiding names that are pejorative (e.g., DOPEY1 was renamed DOP1A) and names based on diseases. CASC4 was originally named for CAncer Susceptibility Candidate 4, and now it’s GOLM2 (golgi membrane protein 2).
Even if gene names will be less of a creative outlet, most scientists have expressed relief that there is more clarity. No word yet on whether the “sonic hedgehog” gene (SHH) or “Pikachurin” (named after the Pokémon Pikachu) will also be renamed.