CSV, Short URLs and Excel

CSV, Short URLs and Excel

2 Share

CSV, Short URLs and Excel

A short URL has a domain + a short hash (tiny.cc/0bza4). 0bza4 is the short hash in this example. Let's say you are importing a batch of  short hashes. Spreadsheets are compri ...

A short URL has a domain + a short hash (tiny.cc/0bza4). 0bza4 is the short hash in this example. Let’s say you are importing a batch of  short hashes.

Spreadsheets are comprised of two components, content and format (number, masking, coloring, etc.). Comma-separated value (CSV) is a file type that only holds content, while containing no format information. Lack of format means Excel needs to apply its own default formatting to it.

Excel will likely apply a number format and most number formats result in unexpected behavior when it comes to short hashes.

Things to Know About Excel’s Default Number Format for Cells

1. Removes leading zeros from number codes

Zeros at the beginning of any sequence are automatically removed from number codes. The reason for this is because Excel assumes that zero is not required to display the value of the number. This means any short hash beginning with 0 could unexpectedly have the 0 dropped. 0bza4 is changed into bza4.

2. “e” in number code triggers scientific notation

An “e” in a short hash can, depending on its position in the sequence, be interpreted by Excel as scientific notation. Excel thinks you meant scientific notation and formally converts it for you. It sees a number code inside a cell – like 10e22 for example and decides that it needs some tidying up. So unexpectedly the short hash might be replaced with 1E+23.

3. Rounds off numbers

It is unlikely that your short hashes would ever contain enough characters to trigger rounding, but note that Excel has a precision limit and will round off and truncate number codes that exceeds its limit.

Change the Format or Import as XSL or ODS

If you have to import CSV, be sure to pre-format your short hash column to text or custom number that maintains all original characters and digits. Another hack that avoids auto correcting to scientific notation is to insert an apostrophe before the hash.

The best method if you have options is to work with file types that have both content and format information. This means importing as XLS or ODS instead of CSV. With cell format set to string in original file, the import will always be predictable.