This EEP extends numeric literal syntax, allowing underscore _
as digits
separator. It extends syntax of integer, float and based integer literals.
This document has been placed in the public domain.
Current syntax for numeric literals can be described by following scheme:
DIGIT = [0-9]
BASE_DIGIT = [0-9a-zA-Z]
SIGN = [+-]
INTEGER = SIGN? DIGIT+
FLOAT = SIGN? DIGIT+ '.' DIGIT+ ('e' SIGN? DIGIT+)?
BASED_INTEGER = SIGN? DIGIT+ '#' BASE_DIGIT+
EEP extends this syntax by allowing single underscore _
character as a
separator between two DIGIT
or BASE_DIGIT
characters. I.e., characters from
both left and right sides of _
should be digits. Leading, trailing or repeated
underscores should not be allowed as well as underscores next to -
.
#
e
.
Separator is only used to visually separate groups of characters. It doesn’t
change semantics of literal and is removed by lexer.
Following are examples of valid numeric literals:
123_456
1_2_3_4_5
123_456.789_123
1.0e1_23
16#DEAD_BEEF
2#1100_1010_0011
And following are examples of literals that are not allowed:
_123 % will be interpreted as variable name
123_
123__456 % only single underscore allowed
123_.456 % underscore can only separate digits, not other symbols
123._456
16#_1234
16#1234_
The underscores are to be lost when representing abstract forms or terms as
strings (erl_pp
, io_lib
).
Digit separator becomes especially handy for relatively long literals like
thousands separator for decimals Million = 1_000_000
, SecondsInDay = 86_400
byte separator for hexadecimal 16#DEAD_BEEF_CAFE
. Group separator makes
such literals more readable (when used wisely) and makes it easier to visually
spot typos like MicroSecond = Second * 100000
.
The digit separator feature was adopted by multiple programming languages
starting from at least Ada83.
Underscore character was chosen as separator because it is used by multiple
other languages such as “Ada 83”, “C#”, “Java 7”, “Perl 2.0”, “Python 3.6”,
“Ruby”, “Rust”, “Swift” and “Go”. Also it doesn’t conflict with current Erlang
syntax (unlike, for example, ,
).
The only ambiguous decision was about where exactly digit separators should be
allowed. It was decided that underscore should be allowed only between digits
as meant by being a “digit separator”.
All existing Erlang code remains acceptable. The implementation is done entirely in the Erlang lexer, so, no changes are needed in parser and even tools that examine ASTs will be unaffected. External tools like code editors and syntax highlighters might need to be updated to be able to recognize new syntax for numeric literals.
The reference implementation is provided in a form of pull-request on GitHub
https://github.com/erlang/otp/pull/2324