Package google :: Package protobuf :: Module text_format :: Class Tokenizer

Class Tokenizer

object --+
         |
        Tokenizer

Protocol buffer text representation tokenizer.

This class handles the lower-level string parsing by splitting the input text into meaningful tokens.

It was directly ported from the Java protocol buffer API.

Instance Methods

__init__(self, lines, skip_comments=True)
x.__init__(...) initializes x; see help(type(x)) for signature

LookingAt(self, token)

AtEnd(self)
Checks whether the end of the text was reached.

_PopLine(self)

_SkipWhitespace(self)

TryConsume(self, token)
Tries to consume a given piece of text.

Consume(self, token)
Consumes a piece of text.

ConsumeComment(self)

ConsumeCommentOrTrailingComment(self)
Consumes a comment, returns a 2-tuple (trailing bool, comment str).

TryConsumeIdentifier(self)

ConsumeIdentifier(self)
Consumes a protocol message field identifier.

TryConsumeIdentifierOrNumber(self)

ConsumeIdentifierOrNumber(self)
Consumes a protocol message field identifier or number.

TryConsumeInteger(self)

ConsumeInteger(self, is_long=False)
Consumes an integer number.

TryConsumeFloat(self)

ConsumeFloat(self)
Consumes a floating point number.

ConsumeBool(self)
Consumes a boolean value.

TryConsumeByteString(self)

ConsumeString(self)
Consumes a string value.

ConsumeByteString(self)
Consumes a byte array value.

_ConsumeSingleByteString(self)
Consume one token of a string literal.

ConsumeEnum(self, field)

ParseErrorPreviousToken(self, message)
Creates and *returns* a ParseError for the previously read token.

ParseError(self, message)
Creates and *returns* a ParseError for the current token.

_StringParseError(self, e)

NextToken(self)
Reads the next meaningful token.

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__

Class Variables
  _WHITESPACE = re.compile(r'\s+')
  _COMMENT = re.compile(r'(?m)(\s*#.*$)')
  _WHITESPACE_OR_COMMENT = re.compile(r'(?m)(\s|(#.*$))+')
  _TOKEN = re.compile(r'[a-zA-Z_][0-9a-zA-Z_\+-]*|([0-9\+-]|(\.[...
  _IDENTIFIER = re.compile(r'[^\d\W]\w*')
  _IDENTIFIER_OR_NUMBER = re.compile(r'\w+')
  mark = '\''
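
The difference between the identifier patterns listed above is easiest to see by running them. The following is a minimal sketch that copies three of the patterns into standalone names using only the standard `re` module; the helper `matches` is hypothetical and is not part of the Tokenizer class.

```python
import re

# Patterns copied from the class variable listing above.
IDENTIFIER = re.compile(r'[^\d\W]\w*')
IDENTIFIER_OR_NUMBER = re.compile(r'\w+')
WHITESPACE_OR_COMMENT = re.compile(r'(?m)(\s|(#.*$))+')


def matches(pattern, text):
    """Return the text matched at the start of `text`, or None (hypothetical helper)."""
    m = pattern.match(text)
    return m.group(0) if m else None


# An identifier must start with a word character that is not a digit.
print(matches(IDENTIFIER, 'field_1: 5'))            # field_1
print(matches(IDENTIFIER, '1field'))                # None
# The identifier-or-number pattern accepts a leading digit.
print(matches(IDENTIFIER_OR_NUMBER, '1field'))      # 1field
# Whitespace and a '#' comment up to end of line are matched as one run.
print(repr(matches(WHITESPACE_OR_COMMENT, '  # note\nname')))  # '  # note\n'
```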

Properties

Inherited from object: __class__

Method Details

__init__(self, lines, skip_comments=True)
(Constructor)

x.__init__(...) initializes x; see help(type(x)) for signature

Overrides: object.__init__
(inherited documentation)

AtEnd(self)

Checks whether the end of the text was reached.

Returns:
  True iff the end was reached.

TryConsume(self, token)

Tries to consume a given piece of text.

Args:
  token: Text to consume.

Returns:
  True iff the text was consumed.

Consume(self, token)

Consumes a piece of text.

Args:
  token: Text to consume.

Raises:
  ParseError: If the text couldn't be consumed.
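
The contract documented for TryConsume and Consume differs only in how failure is reported: one returns False, the other raises. The sketch below illustrates that contract with a toy class over a pre-split token list. `MiniTokenizer` and `SketchParseError` are hypothetical stand-ins written for this illustration, not the library's implementation.

```python
class SketchParseError(Exception):
    """Stand-in for the module's ParseError (hypothetical)."""


class MiniTokenizer:
    """Toy tokenizer over a pre-split token list; not the real Tokenizer."""

    def __init__(self, tokens):
        self._tokens = list(tokens)
        self._pos = 0

    def AtEnd(self):
        # True once every token has been consumed.
        return self._pos >= len(self._tokens)

    def TryConsume(self, token):
        # Advance only when the current token matches; report the outcome.
        if not self.AtEnd() and self._tokens[self._pos] == token:
            self._pos += 1
            return True
        return False

    def Consume(self, token):
        # Same check, but a mismatch is an error rather than a False result.
        if not self.TryConsume(token):
            raise SketchParseError('Expected "%s".' % token)


t = MiniTokenizer(['name', ':', '5'])
print(t.TryConsume('{'))   # False, and nothing is consumed
t.Consume('name')          # succeeds silently
t.Consume(':')
print(t.TryConsume('5'))   # True
print(t.AtEnd())           # True
```

TryConsume suits optional syntax, where the caller branches on the result, while Consume suits required syntax, where a mismatch should stop parsing.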

ConsumeIdentifier(self)

Consumes a protocol message field identifier.

Returns:
  Identifier string.

Raises:
  ParseError: If an identifier couldn't be consumed.

ConsumeIdentifierOrNumber(self)

Consumes a protocol message field identifier or number.

Returns:
  The identifier or number, as a string.

Raises:
  ParseError: If an identifier or number couldn't be consumed.

ConsumeInteger(self, is_long=False)

Consumes an integer number.

Args:
  is_long: True if the value should be returned as a long integer.

Returns:
  The integer parsed.

Raises:
  ParseError: If an integer couldn't be consumed.

ConsumeFloat(self)

Consumes a floating point number.

Returns:
  The number parsed.

Raises:
  ParseError: If a floating point number couldn't be consumed.

ConsumeBool(self)

Consumes a boolean value.

Returns:
  The bool parsed.

Raises:
  ParseError: If a boolean value couldn't be consumed.

ConsumeString(self)

Consumes a string value.

Returns:
  The string parsed.

Raises:
  ParseError: If a string value couldn't be consumed.

ConsumeByteString(self)

Consumes a byte array value.

Returns:
  The array parsed (as a string).

Raises:
  ParseError: If a byte array value couldn't be consumed.

_ConsumeSingleByteString(self)

Consume one token of a string literal.

String literals (whether bytes or text) can come in multiple adjacent
tokens which are automatically concatenated, like in C or Python.  This
method only consumes one token.

Returns:
  The token parsed.

Raises:
  ParseError: If data in the wrong format is found.
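
The concatenation of adjacent string literals described above can be sketched as follows. This is a simplified illustration, assuming literals are separated only by whitespace and contain no escape sequences; the pattern `QUOTED` and the function `concat_adjacent_literals` are hypothetical and do not appear in the module.

```python
import re

# Simplified quoted-literal pattern (no escape handling); an assumption
# made for this sketch, not the pattern used by the Tokenizer.
QUOTED = re.compile(r'"[^"\n]*"|\'[^\'\n]*\'')


def concat_adjacent_literals(text):
    """Join the bodies of all quoted literals found in `text`."""
    # Each match is one literal token; strip its quotes before joining.
    return ''.join(m.group(0)[1:-1] for m in QUOTED.finditer(text))


# Three adjacent literals, mixing quote styles, yield one value.
print(concat_adjacent_literals('"foo" \'bar\'  "baz"'))  # foobarbaz

# Python's own parser applies the same rule to adjacent literals.
print('foo' "bar" == 'foobar')  # True
```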

ParseErrorPreviousToken(self, message)

Creates and *returns* a ParseError for the previously read token.

Args:
  message: A message to set for the exception.

Returns:
  A ParseError instance.
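
Because the method creates and returns the exception rather than raising it, the caller is expected to write the `raise` statement. The sketch below shows that calling pattern with hypothetical stand-ins (`SketchParseError`, `MiniTokenizer`, `expect_colon`); the position fields and the message format are assumptions made for illustration.

```python
class SketchParseError(Exception):
    """Stand-in for the module's ParseError (hypothetical)."""


class MiniTokenizer:
    """Toy object that only tracks a position; not the real Tokenizer."""

    def __init__(self):
        # Hypothetical zero-based position of the current token.
        self.line = 0
        self.column = 0

    def ParseError(self, message):
        # Build the exception but leave raising to the caller.
        return SketchParseError(
            '%d:%d : %s' % (self.line + 1, self.column + 1, message))


def expect_colon(tokenizer, token):
    """Raise the error built by the tokenizer when `token` is not ':'."""
    if token != ':':
        raise tokenizer.ParseError('Expected ":".')


tok = MiniTokenizer()
expect_colon(tok, ':')          # passes silently
try:
    expect_colon(tok, ';')
except SketchParseError as e:
    print(e)                    # 1:1 : Expected ":".
```

Keeping the `raise` at the call site makes the control flow visible where the error is detected.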


Class Variable Details

_TOKEN

Value:
re.compile(r'[a-zA-Z_][0-9a-zA-Z_\+-]*|([0-9\+-]|(\.[0-9]))[0-9a-zA-Z_\.\+-]*|"[^"\n\\]*((\\.)+[^"\n\\]*)*("|\\?$)|\'[^\'\n\\]*((\\.)+[^\'\n\\]*)*(\'|\\?$)')
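
To see what the pattern above accepts, it helps to split it into its four alternatives and run it over a line of text. This is a minimal sketch, assuming the trailing backslashes in the displayed value are display line wraps rather than part of the pattern. The function `tokens` is hypothetical, and its fallback of treating any unmatched character as a one-character token is an assumption of this sketch, as are the labels in the comments.

```python
import re

# The displayed pattern, split into its four alternatives for readability.
TOKEN = re.compile(
    r'[a-zA-Z_][0-9a-zA-Z_\+-]*'                 # identifier-like token
    r'|([0-9\+-]|(\.[0-9]))[0-9a-zA-Z_\.\+-]*'   # number-like token
    r'|"[^"\n\\]*((\\.)+[^"\n\\]*)*("|\\?$)'     # double-quoted string
    r"|'[^'\n\\]*((\\.)+[^'\n\\]*)*('|\\?$)"     # single-quoted string
)


def tokens(line):
    """Split one line into tokens (hypothetical helper for illustration)."""
    pos, out = 0, []
    while pos < len(line):
        if line[pos].isspace():
            pos += 1            # skip whitespace between tokens
            continue
        m = TOKEN.match(line, pos)
        # Assumption: an unmatched character becomes a one-character token.
        tok = m.group(0) if m else line[pos]
        out.append(tok)
        pos += len(tok)
    return out


print(tokens('name: "a b" id: -12'))
# ['name', ':', '"a b"', 'id', ':', '-12']
```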