Here is the syntax of Ruby in pseudo BNF. For more detail, see parse.y in Ruby distribution.

下面是Ruby的BNF语法描述,更多细节,参考Ruby分发包中的parse.y文件。

BNF是语言的形式化描述,是最能体现设计和品味的地方,同样也是yacc的输入文件。之前,个人曾经想要自动生成一个c–的编译器,结果一直卡在语法解释器上,也不知道如何处理,最后,由于当时还有其他的事情想做,就放弃了。

PROGRAM		: COMPSTMT

COMPSTMT	: STMT (TERM EXPR)* [TERM]

STMT		: CALL do [`|' [BLOCK_VAR] `|'] COMPSTMT end
                | undef FNAME
		| alias FNAME FNAME
		| STMT if EXPR
		| STMT while EXPR
		| STMT unless EXPR
		| STMT until EXPR
                | `BEGIN' `{' COMPSTMT `}'
                | `END' `{' COMPSTMT `}'
                | LHS `=' COMMAND [do [`|' [BLOCK_VAR] `|'] COMPSTMT end]
		| EXPR

EXPR		: MLHS `=' MRHS
		| return CALL_ARGS
		| yield CALL_ARGS
		| EXPR and EXPR
		| EXPR or EXPR
		| not EXPR
		| COMMAND
		| `!' COMMAND
		| ARG

CALL		: FUNCTION
                | COMMAND

COMMAND		: OPERATION CALL_ARGS
		| PRIMARY `.' OPERATION CALL_ARGS
		| PRIMARY `::' OPERATION CALL_ARGS
		| super CALL_ARGS

FUNCTION        : OPERATION [`(' [CALL_ARGS] `)']
		| PRIMARY `.' OPERATION `(' [CALL_ARGS] `)'
		| PRIMARY `::' OPERATION `(' [CALL_ARGS] `)'
		| PRIMARY `.' OPERATION
		| PRIMARY `::' OPERATION
		| super `(' [CALL_ARGS] `)'
		| super

ARG		: LHS `=' ARG
		| LHS OP_ASGN ARG
		| ARG `..' ARG
		| ARG `...' ARG
		| ARG `+' ARG
		| ARG `-' ARG
		| ARG `*' ARG
		| ARG `/' ARG
		| ARG `%' ARG
		| ARG `**' ARG
		| `+' ARG
		| `-' ARG
		| ARG `|' ARG
		| ARG `^' ARG
		| ARG `&' ARG
		| ARG `<=>' ARG
		| ARG `>' ARG
		| ARG `>=' ARG
		| ARG `<' ARG
		| ARG `<=' ARG
		| ARG `==' ARG
		| ARG `===' ARG
		| ARG `!=' ARG
		| ARG `=~' ARG
		| ARG `!~' ARG
		| `!' ARG
		| `~' ARG
		| ARG `<<' ARG
		| ARG `>>' ARG
		| ARG `&&' ARG
		| ARG `||' ARG
		| defined? ARG
		| PRIMARY

PRIMARY		: `(' COMPSTMT `)'
		| LITERAL
		| VARIABLE
		| PRIMARY `::' IDENTIFIER
		| `::' IDENTIFIER
		| PRIMARY `[' [ARGS] `]'
		| `[' [ARGS [`,']] `]'
		| `{' [(ARGS|ASSOCS) [`,']] `}'
		| return [`(' [CALL_ARGS] `)']
		| yield [`(' [CALL_ARGS] `)']
		| defined? `(' ARG `)'
                | FUNCTION
		| FUNCTION `{' [`|' [BLOCK_VAR] `|'] COMPSTMT `}'
		| if EXPR THEN
		  COMPSTMT
		  (elsif EXPR THEN COMPSTMT)*
		  [else COMPSTMT]
		  end
		| unless EXPR THEN
		  COMPSTMT
		  [else COMPSTMT]
		  end
		| while EXPR DO COMPSTMT end
		| until EXPR DO COMPSTMT end
		| case COMPSTMT
		  (when WHEN_ARGS THEN COMPSTMT)+
		  [else COMPSTMT]
		  end
		| for BLOCK_VAR in EXPR DO
		  COMPSTMT
		  end
		| begin
		  COMPSTMT
		  [rescue [ARGS] DO COMPSTMT]+
		  [else COMPSTMT]
		  [ensure COMPSTMT]
		  end
		| class IDENTIFIER [`<' IDENTIFIER]
		  COMPSTMT
		  end
		| module IDENTIFIER
		  COMPSTMT
		  end
		| def FNAME ARGDECL
		  COMPSTMT
		  end
		| def SINGLETON (`.'|`::') FNAME ARGDECL
		  COMPSTMT
		  end

WHEN_ARGS	: ARGS [`,' `*' ARG]
		| `*' ARG

THEN		: TERM
		| then
		| TERM then

DO		: TERM
		| do
		| TERM do

BLOCK_VAR	: LHS
		| MLHS

MLHS		: MLHS_ITEM `,' [MLHS_ITEM (`,' MLHS_ITEM)*] [`*' [LHS]]
                | `*' LHS

MLHS_ITEM	: LHS
		| '(' MLHS ')'

LHS		: VARIABLE
		| PRIMARY `[' [ARGS] `]'
		| PRIMARY `.' IDENTIFIER

MRHS		: ARGS [`,' `*' ARG]
		| `*' ARG

CALL_ARGS	: ARGS
		| ARGS [`,' ASSOCS] [`,' `*' ARG] [`,' `&' ARG]
		| ASSOCS [`,' `*' ARG] [`,' `&' ARG]
		| `*' ARG [`,' `&' ARG]
		| `&' ARG
		| COMMAND

ARGS 		: ARG (`,' ARG)*

ARGDECL		: `(' ARGLIST `)'
		| ARGLIST TERM

ARGLIST		: IDENTIFIER(`,'IDENTIFIER)*[`,'`*'[IDENTIFIER]][`,'`&'IDENTIFIER]
		| `*'IDENTIFIER[`,'`&'IDENTIFIER]
		| [`&'IDENTIFIER]

SINGLETON	: VARIABLE
		| `(' EXPR `)'

ASSOCS		: ASSOC (`,' ASSOC)*

ASSOC		: ARG `=>' ARG

VARIABLE	: VARNAME
		| nil
		| self

LITERAL		: numeric
		| SYMBOL
		| STRING
		| STRING2
		| HERE_DOC
		| REGEXP

TERM		: `;'
		| `\n'

The followings are recognized by lexical analizer.


OP_ASGN		: `+=' | `-=' | `*=' | `/=' | `%=' | `**='
		| `&=' | `|=' | `^=' | `<<=' | `>>='
		| `&&=' | `||='

SYMBOL		: `:'FNAME
		| `:'VARNAME

FNAME		: IDENTIFIER | `..' | `|' | `^' | `&'
		| `<=>' | `==' | `===' | `=~'
                | `>' | `>=' | `<' | `<='
		| `+' | `-' | `*' | `/' | `%' | `**'
		| `<<' | `>>' | `~'
                | `+@' | `-@' | `[]' | `[]='

OPERATION       : IDENTIFIER
                | IDENTIFIER'!'
                | IDENTIFIER'?'

VARNAME		: GLOBAL
		| `@'IDENTIFIER
		| IDENTIFIER

GLOBAL		: `$'IDENTIFIER
		| `$'any_char
		| `$''-'any_char

STRING		: `"' any_char* `"'
		| `'' any_char* `''
		| ``' any_char* ``'

STRING2		: `%'(`Q'|`q'|`x')char any_char* char

HERE_DOC        : `<<'(IDENTIFIER|STRING)
                  any_char*
                  IDENTIFIER

REGEXP		: `/' any_char* `/'[`i'|`o'|`p']
		| `%'`r' char any_char* char

IDENTIFIER is the sqeunce of characters in the pattern of /[a-zA-Z][a-zA-Z0-9]*/.

IDENTIFIER是/[a-zA-Z][a-zA-Z0-9]*/模式的字符序列。