In Java, variable names can start with a letter, a currency character ($), etc., but not with a digit, :, or .
Simple question: why is that?
Why doesn't the compiler allow variable declarations such as
int 7dfs;
Simply put, it would break facets of the language grammar.
For example, would 7f be a variable name, or a floating point literal with a value of 7?
You can conjure others too: if . was allowed then that would clash with the member selection operator: would foo.bar be an identifier in its own right, or would it be the bar field of an object instance foo?
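The literal side of that ambiguity is easy to check under today's rules. A minimal sketch (the class name LiteralDemo and the helper typeOfSevenF are made up for illustration):

```java
class LiteralDemo {
    // Returns the simple name of the runtime type that the expression 7f boxes to.
    static String typeOfSevenF() {
        Object boxed = 7f; // 7f is a float literal, so it boxes to java.lang.Float
        return boxed.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(typeOfSevenF()); // Float
        System.out.println(7f + 1);         // 8.0
        // int 7f = 1;  // does not compile: 7f is always read as a literal, never as a name
    }
}
```

Because no identifier may start with a digit, the compiler never has to guess which of the two readings was meant.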
Because the Java Language specification says so:
IdentifierChars:
JavaLetter {JavaLetterOrDigit}
So - yes, an identifier must start with a letter; it can't start with a digit.
The main reasons behind that:
it is simply what most people expect
it makes parsing source code (much) easier when you restrict the "layout" of identifiers; for example it reduces the possible ambiguities between literals and variable names.
I was just wondering if I am breaking some sort of naming convention while naming my variables.
Let's say I have an int named numerator1. Is it wrong to name the double type conversion variable as doubleNumerator1?
Except for variables, all instance, class, and class constants are in mixed case with a lowercase first letter. Internal words start with capital letters. Variable names should not start with underscore _ or dollar sign $ characters, even though both are allowed.
Variable names should be short yet meaningful. The choice of a variable name should be mnemonic- that is, designed to indicate to the casual observer the intent of its use. One-character variable names should be avoided except for temporary "throwaway" variables. Common names for temporary variables are i, j, k, m, and n for integers; c, d, and e for characters.
This is the Naming convention for variables from Oracle.
So technically, no. But a variable name like numerator1 is not useful in most cases, since nobody reading it would know what the name signifies. My rule of thumb for naming a variable: if it is used immediately and its use is clear, I can name it something short. But if I have to use the variable more than once, or it gets used more than 20 lines later, I would give it a more meaningful name.
Reading the Java Code Conventions document from 1997, I saw this in an example on P16 about variable naming conventions:
int i;
char *cp;
float myWidth;
The second declaration is of interest - to me it looks a lot like how you might declare a pointer in C. It gives a syntax error when compiling under Java 8.
Just out of curiosity: was this ever valid syntax? If so, what did it mean?
It's a copy-paste error, I suppose.
From JLS 1 (which is really not that easy to find!), the section on local variable declarations states that such a declaration, in essence, is a type followed by an identifier. Note that there is no special reference made about *, but there is special reference made about [] (for arrays).
char is our type, so the only possibility that remains is that *cp is an identifier. The section on Identifiers states
An identifier is an unlimited-length sequence of Java letters and Java
digits, the first of which must be a Java letter.
...
A Java letter is a character for which the method Character.isJavaLetter (§20.5.17) returns true
And the JavaDoc for that method states:
A character is considered to be a Java letter if and only if it is a
letter (§20.5.15) or is the dollar sign character '$' (\u0024) or the
underscore ("low line") character '_' (\u005F).
so foo, _foo and $foo were fine, but *foo was never valid.
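You can check the first-character rule yourself. Character.isJavaLetter has since been deprecated in favour of Character.isJavaIdentifierStart, so this small sketch uses the newer method (the class name FirstCharDemo and the helper canStart are just illustrative):

```java
class FirstCharDemo {
    // True if c may begin a Java identifier (modern replacement for the deprecated isJavaLetter).
    static boolean canStart(char c) {
        return Character.isJavaIdentifierStart(c);
    }

    public static void main(String[] args) {
        // Letters, '_' and '$' are accepted; '*' and digits are rejected.
        for (char c : new char[] {'f', '_', '$', '*', '7'}) {
            System.out.println("'" + c + "' -> " + canStart(c));
        }
    }
}
```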
If you want a more up-to-date Java style guide, Google's style guide is arguably the most commonly referenced.
It appears that this is a generic coding style document for C-like languages with some Java-specific additions. See, for example, also the next page:
Do not use the assignment operator in a place where it can be easily confused with the equality operator. Example:
if (c++ = d++) { // AVOID! Java disallows.
…
}
It does not make sense to tell a programmer to avoid something that is a syntax error anyway, so the only conclusion we can draw from this is that the document is not 100% Java-specific.
Another possibility is that it was meant as a coding style for the entire Java system, including the C++ parts of the JRE and JDK.
Note that Sun abandoned the coding style document long before Oracle came into the picture. They restricted themselves to specifying what the language is, not how to use it.
Invalid syntax!
It's just a copy/paste mistake.
The * token in a variable declaration applies only to C, which has pointers; Java has no pointer types.
In Java, * is used as an operator, never as part of a variable declaration.
What characters are valid in a Java class name? What other rules govern Java class names (for instance, Java class names cannot begin with a number)?
You can have almost any character, including most Unicode characters! The exact definition is in the Java Language Specification under section 3.8: Identifiers.
An identifier is an unlimited-length sequence of Java letters and Java digits, the first of which must be a Java letter. ...
Letters and digits may be drawn from the entire Unicode character set, ... This allows programmers to use identifiers in their programs that are written in their native languages.
An identifier cannot have the same spelling (Unicode character sequence) as a keyword (§3.9), boolean literal (§3.10.3), or the null literal (§3.10.7), or a compile-time error occurs.
However, see this question for whether or not you should do that.
Every programming language has its own set of rules and conventions for the kinds of names that you're allowed to use, and the Java programming language is no different. The rules and conventions for naming your variables can be summarized as follows:
Variable names are case-sensitive. A variable's name can be any legal identifier — an unlimited-length sequence of Unicode letters and digits, beginning with a letter, the dollar sign "$", or the underscore character "_". The convention, however, is to always begin your variable names with a letter, not "$" or "_". Additionally, the dollar sign character, by convention, is never used at all. You may find some situations where auto-generated names will contain the dollar sign, but your variable names should always avoid using it. A similar convention exists for the underscore character; while it's technically legal to begin your variable's name with "_", this practice is discouraged. White space is not permitted.
Subsequent characters may be letters, digits, dollar signs, or underscore characters. Conventions (and common sense) apply to this rule as well. When choosing a name for your variables, use full words instead of cryptic abbreviations. Doing so will make your code easier to read and understand. In many cases it will also make your code self-documenting; fields named cadence, speed, and gear, for example, are much more intuitive than abbreviated versions, such as s, c, and g. Also keep in mind that the name you choose must not be a keyword or reserved word.
If the name you choose consists of only one word, spell that word in all lowercase letters. If it consists of more than one word, capitalize the first letter of each subsequent word. The names gearRatio and currentGear are prime examples of this convention. If your variable stores a constant value, such as static final int NUM_GEARS = 6, the convention changes slightly, capitalizing every letter and separating subsequent words with the underscore character. By convention, the underscore character is never used elsewhere.
From the official Java Tutorial.
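Put together, the conventions quoted above look roughly like this. It is only a sketch: the field and constant names come from the quoted text, while the class name Bicycle and the gearRatio() method are made up here for illustration.

```java
class Bicycle {
    // Constant: every letter capitalized, words separated by underscores.
    static final int NUM_GEARS = 6;

    // Fields: full words instead of cryptic abbreviations, first word in lowercase.
    int cadence = 0;
    int speed = 0;
    int currentGear = 1;

    // Hypothetical helper, present only to show a multi-word camelCase name.
    double gearRatio() {
        return (double) currentGear / NUM_GEARS;
    }
}
```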
Further to the previous answers, it's worth noting that:
Java allows any Unicode currency symbol in symbol names, so the following will all work:
$var1
£var2
€var3
I believe the usage of currency symbols originates in C/C++, where variables added to your code by the compiler conventionally started with '$'. An obvious example in Java is the names of '.class' files for inner classes, which by convention have the format 'Outer$Inner.class'
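If you want to confirm the currency-symbol claim without fighting source-file encodings, here is a small check that uses Unicode escapes. The class name CurrencyIdentifierDemo and the helper isCurrency are hypothetical names for this sketch.

```java
class CurrencyIdentifierDemo {
    // True if c belongs to the Unicode "currency symbol" category (Sc).
    static boolean isCurrency(char c) {
        return Character.getType(c) == Character.CURRENCY_SYMBOL;
    }

    public static void main(String[] args) {
        // Dollar sign, pound sign, euro sign.
        char[] symbols = {'$', '\u00A3', '\u20AC'};
        for (char c : symbols) {
            System.out.println(c + ": currency=" + isCurrency(c)
                    + ", canStartIdentifier=" + Character.isJavaIdentifierStart(c));
        }
    }
}
```

All three characters are reported as currency symbols and as legal first characters of an identifier.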
Many C# and C++ programmers adopt the convention of placing 'I' in front of interfaces (aka pure virtual classes in C++). This is not required, and hence not done, in Java because the implements keyword makes it very clear when something is an interface.
Compare:
class Employee : public IPayable //C++
with
class Employee : IPayable //C#
and
class Employee implements Payable //Java
Many projects use the convention of placing an underscore in front of field names, so that they can readily be distinguished from local variables and parameters e.g.
private double _salary;
A tiny minority place the underscore after the field name e.g.
private double salary_;
As already stated by Jason Cohen, the Java Language Specification defines what a legal identifier is in section 3.8:
"An identifier is an unlimited-length sequence of Java letters and Java digits, the
first of which must be a Java letter. [...] A 'Java letter' is a character for which the method Character.isJavaIdentifierStart(int) returns true. A 'Java letter-or-digit' is a character for which the method Character.isJavaIdentifierPart(int) returns true."
This hopefully answers your second question. Regarding your first question; I've been taught both by teachers and (as far as I can remember) Java compilers that a Java class name should be an identifier that begins with a capital letter A-Z, but I can't find any reliable source on this. When trying it out with OpenJDK there are no warnings when beginning class names with lower-case letters or even a $-sign. When using a $-sign, you do have to escape it if you compile from a bash shell, however.
I'd like to add to bosnic's answer that any valid currency character is legal for an identifier in Java. th€is is a legal identifier, as is €this, and € as well. However, I can't figure out how to edit his or her answer, so I am forced to post this trivial addition.
What other rules govern Java class names (for instance, Java class names cannot begin with a number)?
Java class names usually begin with a capital letter.
Java class names cannot begin with a number.
If there are multiple words in the class name, each word should begin with a capital letter, e.g. "MyClassName". This naming convention is known as CamelCase.
Class names should be nouns in UpperCamelCase, with the first letter of every word capitalised. Use whole words — avoid acronyms and abbreviations (unless the abbreviation is much more widely used than the long form, such as URL or HTML).
The naming conventions can be read over here:
http://www.oracle.com/technetwork/java/codeconventions-135099.html
Identifiers are used for class names, method names, and variable names. An identifier may be any descriptive sequence of uppercase and lowercase letters, numbers, or the underscore and dollar-sign characters. They must not begin with a number, lest they be confused with a numeric literal. Again, Java is case-sensitive, so VALUE is a different identifier than Value.
Some examples of valid identifiers are:
AvgTemp, count, a4, $test, this_is_ok
Invalid variable names include:
2count, high-temp, Not/ok
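The examples above can be checked mechanically. The following is a minimal sketch of a validator built on Character.isJavaIdentifierStart and Character.isJavaIdentifierPart; the class and method names are invented for this example, and it deliberately ignores the separate rule that a name must not be a keyword.

```java
class IdentifierCheck {
    // One "Java letter" followed by any number of "Java letter-or-digit" characters.
    // Keywords such as "int" are NOT rejected here; that is a separate rule.
    static boolean isValidIdentifier(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!Character.isJavaIdentifierStart(first)) {
            return false;
        }
        // Walk by code point so characters outside the BMP are handled correctly.
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isJavaIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        String[] names = {"AvgTemp", "count", "a4", "$test", "this_is_ok",
                          "2count", "high-temp", "Not/ok"};
        for (String name : names) {
            System.out.println(name + " -> " + isValidIdentifier(name));
        }
    }
}
```

The first five names are accepted and the last three are rejected: 2count starts with a digit, while - and / are not identifier characters at all.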
This is a question from a Java test I took at University
I. publicProtected
II. $_
III. _identi#ficador
IV. Protected
I'd say I, II, and IV are correct. What is the correct answer for this?
Source of the question (translated from Spanish): Given the following list of variable identifiers, which one(s) is (are) valid?
From the java documentation:
Variable names are case-sensitive. A variable's name can be any legal
identifier — an unlimited-length sequence of Unicode letters and
digits, beginning with a letter, the dollar sign "$", or the
underscore character "_". The convention, however, is to always begin
your variable names with a letter, not "$" or "_". Additionally, the
dollar sign character, by convention, is never used at all. You may
find some situations where auto-generated names will contain the
dollar sign, but your variable names should always avoid using it. A
similar convention exists for the underscore character; while it's
technically legal to begin your variable's name with "_", this
practice is discouraged. White space is not permitted. Subsequent
characters may be letters, digits, dollar signs, or underscore
characters. Conventions (and common sense) apply to this rule as well.
When choosing a name for your variables, use full words instead of
cryptic abbreviations. Doing so will make your code easier to read and
understand. In many cases it will also make your code
self-documenting; fields named cadence, speed, and gear, for example,
are much more intuitive than abbreviated versions, such as s, c, and
g. Also keep in mind that the name you choose must not be a keyword or
reserved word.
https://docs.oracle.com/javase/tutorial/java/nutsandbolts/variables.html
In short: yes, you're right. You can use underscores, dollar signs, and letters to start a variable name. After the first character of the variable name, you can also use digits. Note that using dollar signs is generally not good practice.
From your comment, you said that your teacher rejected "II". Under your question, II is perfectly fine (try it, it will run). However, if the question on your test asked which are "good" variable names, or which variable names follow common practice, then II would be eliminated as explained in the quotation above. One reason for this is that dollar signs do not make readable variable names; they're included because internally Java makes variables that use the dollar sign.
What is the meaning of $ in a variable name?
As pointed out in the comments, IV is not a good name either, since its lower-case version, "protected", is a reserved keyword. With syntax highlighting, you probably wouldn't get the two confused, but using keyword variations as variable names is certainly one way to confuse future readers.
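If you would rather not take anyone's word for it, the JDK ships a helper for exactly this kind of check: javax.lang.model.SourceVersion. Here is a sketch that runs the four quiz options plus the lower-case keyword through it (QuizCheck and isUsableName are names I made up):

```java
import javax.lang.model.SourceVersion;

class QuizCheck {
    // A usable variable name is a syntactically valid identifier that is not a keyword.
    // isIdentifier also accepts keywords, hence the separate isKeyword test.
    static boolean isUsableName(String s) {
        return SourceVersion.isIdentifier(s) && !SourceVersion.isKeyword(s);
    }

    public static void main(String[] args) {
        String[] candidates = {"publicProtected", "$_", "_identi#ficador", "Protected", "protected"};
        for (String s : candidates) {
            System.out.println(s + " -> " + isUsableName(s));
        }
    }
}
```

This reports I, II, and IV as usable, rejects III because of the #, and rejects the lower-case protected because it is a keyword.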
private, protected, and public are reserved keywords in Java. To use those words in a name, combine them with an underscore or other characters, for example:
int public_x;
int protected_x;
String private_s;
I cannot think of any reason other than "a string of digits would be a valid identifier as well as a valid number."
Is there any other explanation other than this one?
Because that would make telling number literals from symbol names a serious PITA.
For example, if a digit were valid as the first character, variables named 0xdeadbeef or 0xc00lcafe would be valid. But those could be interpreted as hexadecimal numbers as well. By limiting the first character of a symbol to be a non-digit, ambiguities of that kind are avoided.
If it could then this assignment would be possible
int 33 = 44; // oh oh
then how would the compiler distinguish between a numeric literal and a variable?
It's to keep the rules simple for the compiler as well as for the programmer.
An identifier could be defined as any alphanumeric sequence that can not be interpreted as a number, but you would get into situations where the compiler would interpret the code differently from what you expect.
Example:
double 1e = 9;
double x = 1e-4;
The result in x would not be 5 but 0.0001 as 1e-4 is a number in scientific notation and not interpreted as 1e minus 4.
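The literal reading is what Java actually does today, which you can confirm with a few lines (the class name ExponentLiteralDemo and the helper parsed are only illustrative):

```java
class ExponentLiteralDemo {
    // 1e-4 is a single token: a double literal in scientific notation.
    static double parsed() {
        return 1e-4;
    }

    public static void main(String[] args) {
        System.out.println(parsed());           // 1.0E-4
        System.out.println(parsed() == 0.0001); // true
    }
}
```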
This is done in Java and in many other languages so that a parser could classify a terminal symbol uniquely regardless of its surrounding context. Technically, it is entirely possible to allow identifiers that look like numbers or even like keywords: for example, it is possible to write a parser that lifts the restriction on identifiers, allowing you to write something like this:
int 123 = 321; // 123 is an identifier in this imaginary compiler
The compiler knows enough to "understand" that whatever comes after the type name must be a variable name, so 123 is an identifier, and so it could treat this as a valid declaration. However, this would create more ambiguities down the road, because 123 becomes an invalid number "shadowed" by your new "identifier".
In the end, the rule works both ways: it helps compiler designers write simpler compilers, and it also helps programmers write readable code.
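To make the "classify a token without looking at its context" point concrete, here is a deliberately tiny lexer sketch. It is not how javac works internally; TinyLexer and its token labels are invented for illustration, it only handles plain decimal integers, and it does not distinguish keywords from other identifiers.

```java
import java.util.ArrayList;
import java.util.List;

class TinyLexer {
    // Splits the input into tokens, deciding each token's kind from its first character alone.
    static List<String> tokenize(String src) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            int start = i;
            if (Character.isWhitespace(c)) {
                i++; // skip whitespace
            } else if (Character.isDigit(c)) {
                // Starts with a digit: it can only be a number.
                while (i < src.length() && Character.isDigit(src.charAt(i))) i++;
                out.add("NUMBER:" + src.substring(start, i));
            } else if (Character.isJavaIdentifierStart(c)) {
                // Starts with a "Java letter": it can only be a name (or keyword).
                while (i < src.length() && Character.isJavaIdentifierPart(src.charAt(i))) i++;
                out.add("IDENT:" + src.substring(start, i));
            } else {
                i++;
                out.add("SYMBOL:" + c);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("int x1 = 321;"));
        // [IDENT:int, IDENT:x1, SYMBOL:=, NUMBER:321, SYMBOL:;]
    }
}
```

The branch taken depends only on the first character, so the lexer never needs to know whether it is inside a declaration or an expression. If identifiers could start with digits, that one-character decision would no longer be possible.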
Note that there were attempts in the past to build compilers that are not particularly picky about names of identifiers - for example
int a real int = 3
would declare an identifier with spaces (i.e. "a real int" is a single identifier). This did not help readability, though, so modern compilers abandoned the trend.