  1. Jan 24, 2020
  2. Aug 21, 2019
  3. Jun 11, 2019
  4. Mar 26, 2019
  5. Jan 24, 2019
    • json: Fix % handling when not interpolating · bbc0586c
      Christophe Fergeau authored
      Commit 8bca4613 added support for %% in json strings when interpolating,
      but in doing so broke handling of % when not interpolating.
      
      When parse_string() is fed a string token containing '%', it skips the
      '%' regardless of ctxt->ap, i.e. even if it's not interpolating.  If the
      '%' is the string's last character, it fails an assertion.  Else, it
      "merely" swallows the '%'.
      
      Fix parse_string() to handle '%' specially only when interpolating.
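
      The intended behavior can be sketched as follows.  This is a
      minimal illustration, not the actual parse_string(): the helper
      name copy_json_string() and its signature are hypothetical, and the
      @interpolating flag stands in for the ctxt->ap check.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not QEMU's parse_string(): copy @in to @out,
 * treating '%' specially only when @interpolating is true.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or -1 if an interpolating template contains a lone '%'.
 * @out must have room for strlen(in) + 1 bytes.
 */
int copy_json_string(const char *in, bool interpolating, char *out)
{
    int n = 0;

    for (size_t i = 0; in[i] != '\0'; i++) {
        if (interpolating && in[i] == '%') {
            if (in[i + 1] != '%') {
                return -1;          /* only "%%" is accepted in templates */
            }
            i++;                    /* collapse "%%" into a single '%' */
        }
        out[n++] = in[i];           /* plain copy, including '%' when not interpolating */
    }
    out[n] = '\0';
    return n;
}
```

      With @interpolating false, a trailing '%' is copied like any other
      character instead of being swallowed or tripping an assertion.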
      
      To gauge the bug's impact, let's review non-interpolating users of this
      parser, i.e. code passing NULL context to json_message_parser_init():
      
      * tests/check-qjson.c, tests/test-qobject-input-visitor.c,
        tests/test-visitor-serialization.c
      
        Plenty of tests, but we still failed to cover the buggy case.
      
      * monitor.c: QMP input
      
      * qga/main.c: QGA input
      
      * qobject_from_json():
      
        - qobject-input-visitor.c: JSON command line option arguments of
          -display and -blockdev
      
          Reproducer: -blockdev '{"%"}'
      
        - block.c: JSON pseudo-filenames starting with "json:"
      
          Reproducer: https://bugzilla.redhat.com/show_bug.cgi?id=1668244#c3
      
      
      
        - block/rbd.c: JSON key pairs
      
          Pseudo-filenames starting with "rbd:".
      
      Command line, QMP and QGA input are trusted.
      
      Filenames are trusted when they come from command line, QMP or HMP.
      They are untrusted when they come from image file headers.
      Example: QCOW2 backing file name.  Note that this is *not* the security
      boundary between host and guest.  It's the boundary between host and an
      image file from an untrusted source.
      
      Neither failing an assertion nor skipping a character in a filename of
      your choice looks exploitable.  Note that we don't support compiling
      with NDEBUG.
      
      Fixes: 8bca4613
      Cc: qemu-stable@nongnu.org
      Signed-off-by: Christophe Fergeau <cfergeau@redhat.com>
      Message-Id: <20190102140535.11512-1-cfergeau@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Tested-by: Richard W.M. Jones <rjones@redhat.com>
      [Commit message extended to discuss impact]
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      bbc0586c
  6. Dec 13, 2018
  7. Oct 26, 2018
  8. Sep 24, 2018
    • json: Eliminate lexer state IN_WHITESPACE, pseudo-token JSON_SKIP · 1e960b46
      Markus Armbruster authored
      
      The lexer ignores whitespace like this:
      
               on whitespace      on non-ws   spontaneously
          IN_START --> IN_WHITESPACE --> JSON_SKIP --> IN_START
                          ^    |
                           \__/  on whitespace
      
      This accumulates a whitespace token in state IN_WHITESPACE, only to
      throw it away on the transition via JSON_SKIP to the start state.
      Wasteful.  Go from IN_START to IN_START on whitespace directly,
      dropping the whitespace character.
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180831075841.13363-7-armbru@redhat.com>
      1e960b46
    • json: Eliminate lexer state IN_ERROR · 2ce4ee64
      Markus Armbruster authored
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180831075841.13363-6-armbru@redhat.com>
      2ce4ee64
    • json: Nicer recovery from lexical errors · 0f07a5d5
      Markus Armbruster authored
      
      When the lexer chokes on an input character, it consumes the
      character, emits a JSON error token, and enters its start state.  This
      can lead to suboptimal error recovery.  For instance, input
      
          0123 ,
      
      produces the tokens
      
          JSON_ERROR    01
          JSON_INTEGER  23
          JSON_COMMA    ,
      
      Make the lexer skip characters after a lexical error until a
      structural character ('[', ']', '{', '}', ':', ','), an ASCII control
      character, or '\xFE', or '\xFF'.
      
      Note that we must not skip ASCII control characters, '\xFE', '\xFF',
      because those are documented to force the JSON parser into known-good
      state, by docs/interop/qmp-spec.txt.
      
      The lexer now produces
      
          JSON_ERROR    01
          JSON_COMMA    ,
      
      Update qmp-test for the nicer error recovery: QMP now reports just one
      error for input %p instead of two.  Also drop the newline after %p; it
      was needed to tease out the second error.
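
      The skipping rule can be sketched as below.  The function names
      are hypothetical, and treating 0x00..0x1F as "ASCII control
      characters" is an assumption of this sketch, not something the
      commit message spells out.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if the lexer must stop skipping at @ch (assumed rule set). */
bool ends_recovery(unsigned char ch)
{
    switch (ch) {
    case '[': case ']': case '{': case '}': case ':': case ',':
        return true;                /* structural characters */
    case 0xFE: case 0xFF:
        return true;                /* must keep forcing a known-good state */
    default:
        return ch < 0x20;           /* ASCII control characters (assumed range) */
    }
}

/* Number of bytes of @buf to discard after a lexical error. */
size_t recovery_skip_len(const unsigned char *buf, size_t len)
{
    size_t n = 0;

    while (n < len && !ends_recovery(buf[n])) {
        n++;
    }
    return n;
}
```

      For the input above, the error token "01" is followed by "23 ,";
      the sketch discards "23 " and stops at the comma.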
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180831075841.13363-5-armbru@redhat.com>
      [Conflict with commit ebb4d82d resolved]
      0f07a5d5
    • json: Make lexer's "character consumed" logic less confusing · c0ee3afa
      Markus Armbruster authored
      
      The lexer uses macro TERMINAL_NEEDED_LOOKAHEAD() to decide whether a
      state transition consumes the input character.  It returns true when
      the state transition is defined with the TERMINAL() macro.  To detect
      that, it checks whether input '\0' would have resulted in the same
      state transition, and the new state is not IN_ERROR.
      
      Why does that even work?  For all states, the new state on input '\0'
      is either IN_ERROR or defined with TERMINAL().  If the state
      transition equals the one we'd get for input '\0', it goes to IN_ERROR
      or to the argument of TERMINAL().  We never use TERMINAL(IN_ERROR),
      because it makes no sense.  Thus, if it doesn't go to IN_ERROR, it
      must be defined with TERMINAL().
      
      Since this isn't quite confusing enough, we negate the result to get
      @char_consumed, and ignore it when @flush is true.
      
      Instead of deriving the lookahead bit from the state transition, make
      it explicit.  This is easier to understand, and a bit more flexible,
      too.
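
      One way to make the lookahead bit explicit is to store it as a flag
      in the transition value.  The sketch below uses hypothetical state
      names and a hypothetical flag value; it shows the idea, not the
      lexer's real tables.

```c
#include <stdbool.h>
#include <stdint.h>

enum { LOOKAHEAD = 0x80 };                  /* hypothetical flag value */
enum { IN_DIGITS = 2, TOK_INTEGER = 3 };    /* hypothetical states */

/* Transition out of IN_DIGITS: digits are consumed, anything else ends the token. */
uint8_t next_from_digits(unsigned char ch)
{
    if (ch >= '0' && ch <= '9') {
        return IN_DIGITS;                   /* character is consumed */
    }
    return TOK_INTEGER | LOOKAHEAD;         /* character is left for the next token */
}

/* The "character consumed" decision is read off the flag directly. */
bool char_consumed(uint8_t transition)
{
    return !(transition & LOOKAHEAD);
}

/* Strip the flag to get the new state. */
uint8_t new_state(uint8_t transition)
{
    return transition & ~LOOKAHEAD;
}
```

      The flag no longer depends on what input '\0' would have done in
      the same state, so a transition can request lookahead whether or
      not it ends a token.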
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180831075841.13363-4-armbru@redhat.com>
      c0ee3afa
    • json: Clean up how lexer consumes "end of input" · 852dfa76
      Markus Armbruster authored
      
      When the lexer isn't in its start state at the end of input, it's
      working on a token.  To flush it out, it needs to transit to its start
      state on "end of input" lookahead.
      
      There are two ways to the start state, depending on the current state:
      
      * If the lexer is in a TERMINAL(JSON_FOO) state, it can emit a
        JSON_FOO token.
      
      * Else, it can go to IN_ERROR state, and emit a JSON_ERROR token.
      
      There are complications, however:
      
      * The transition to IN_ERROR state consumes the input character and
        adds it to the JSON_ERROR token.  The latter is inappropriate for
        the "end of input" character, so we suppress that.  See also recent
        commit a2ec6be7 "json: Fix lexer to include the bad character in
        JSON_ERROR token".
      
      * The transition to a TERMINAL(JSON_FOO) state doesn't consume the
        input character.  In that case, the lexer normally loops until it is
        consumed.  We have to suppress that for the "end of input" input
        character.  If we didn't, the lexer would consume it by entering
        IN_ERROR state, emitting a bogus JSON_ERROR token.  We fixed that in
        commit bd3924a3.
      
      However, simply breaking the loop this way assumes that the lexer
      needs exactly one state transition to reach its start state.  That
      assumption is correct now, but it's unclean, and I'll soon break it.
      Clean up: instead of breaking the loop after one iteration, break it
      after it reached the start state.
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180831075841.13363-3-armbru@redhat.com>
      852dfa76
    • json: Fix lexer for lookahead character beyond '\x7F' · 2a96042a
      Markus Armbruster authored
      
      The lexer fails to end a valid token when the lookahead character is
      beyond '\x7F'.  For instance, input
      
          true\xC2\xA2
      
      produces the tokens
      
          JSON_ERROR     true\xC2
          JSON_ERROR     \xA2
      
      This should be
      
          JSON_KEYWORD   true
          JSON_ERROR     \xC2
          JSON_ERROR     \xA2
      
      instead.
      
      The culprit is
      
          #define TERMINAL(state) [0 ... 0x7F] = (state)
      
      It leaves [0x80..0xFF] zero, i.e. IN_ERROR.  Has always been broken.
      Fix it to initialize the complete array.
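
      The effect of the range bound is easy to demonstrate in isolation.
      The sketch below assumes a compiler with GNU C range designators
      (GCC, Clang); the macro names, table names and state numbering are
      made up for the example, and widening the range to 0xFF is just one
      way to cover the complete array.

```c
enum { IN_ERROR = 0, JSON_KEYWORD = 1 };    /* hypothetical numbering */

/* GNU C range designators; supported by GCC and Clang, not ISO C. */
#define TERMINAL_7BIT(state) [0 ... 0x7F] = (state)
#define TERMINAL_8BIT(state) [0 ... 0xFF] = (state)

/* Entries not covered by the designator are zero, i.e. IN_ERROR. */
const unsigned char row_7bit[256] = { TERMINAL_7BIT(JSON_KEYWORD) };

/* Every possible lookahead byte ends the token. */
const unsigned char row_8bit[256] = { TERMINAL_8BIT(JSON_KEYWORD) };
```

      Looking up lookahead byte 0xC2 yields IN_ERROR in row_7bit but
      JSON_KEYWORD in row_8bit, which is the difference between the two
      token sequences shown above.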
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180831075841.13363-2-armbru@redhat.com>
      2a96042a
  9. Aug 24, 2018
    • json: Update references to RFC 7159 to RFC 8259 · 37aded92
      Markus Armbruster authored
      
      RFC 8259 (December 2017) obsoletes RFC 7159 (March 2014).
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Message-Id: <20180823164025.12553-59-armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      37aded92
    • json: Support %% in JSON strings when interpolating · 8bca4613
      Markus Armbruster authored
      
      The previous commit makes JSON strings containing '%' awkward to
      express in templates: you'd have to mask the '%' with a Unicode
      escape \u0025.  No template currently contains such JSON strings.
      Support the printf conversion specification %% in JSON strings as a
      convenience anyway, because it's trivially easy to do.
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180823164025.12553-58-armbru@redhat.com>
      8bca4613
    • json: Support for safer templates: Improve safety of qobject_from_jsonf_nofail() & friends · 16a48599
      Markus Armbruster authored
      
      The JSON parser optionally supports interpolation.  This is used to
      build QObjects by parsing string templates.  The templates are C
      literals, so parse errors (such as invalid interpolation
      specifications) are actually programming errors.  Consequently, the
      functions providing parsing with interpolation
      (qobject_from_jsonf_nofail(), qobject_from_vjsonf_nofail(),
      qdict_from_jsonf_nofail(), qdict_from_vjsonf_nofail()) pass
      &error_abort to the parser.
      
      However, there's another, more dangerous kind of programming error:
      since we use va_arg() to get the value to interpolate, behavior is
      undefined when the variable argument isn't consistent with the
      interpolation specification.
      
      The same problem exists with printf()-like functions, and the solution
      is to have the compiler check consistency.  This is what
      GCC_FMT_ATTR() is about.
      
      To enable this type checking for interpolation as well, we carefully
      chose our interpolation specifications to match printf conversion
      specifications, and decorate functions parsing templates with
      GCC_FMT_ATTR().
      
      Note that this only protects against undefined behavior due to type
      errors.  It can't protect against use of invalid interpolation
      specifications that happen to be valid printf conversion
      specifications.
      
      However, there's still a gaping hole in the type checking: GCC
      recognizes '%' as start of printf conversion specification anywhere in
      the template, but the parser recognizes it only outside JSON strings.
      For instance, if someone were to pass a "{ '%s': %d }" template, GCC
      would require a char * and an int argument, but the parser would
      va_arg() only an int argument, resulting in undefined behavior.
      
      Avoid undefined behavior by catching the programming error at run
      time: have the parser recognize and reject '%' in JSON strings.
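
      The run-time check amounts to tracking whether the scan is inside
      a JSON string.  The following is a standalone sketch with a
      hypothetical function name; the real check lives in the parser's
      string handling rather than in a separate pass over the template.

```c
#include <stdbool.h>

/*
 * Hypothetical run-time check: true if @tmpl has a '%' inside a JSON
 * string.  Strings may use single or double quotes, as in the
 * template quoted above; a backslash escapes the next character.
 */
bool template_has_percent_in_string(const char *tmpl)
{
    char quote = '\0';              /* current string delimiter, if any */

    for (const char *p = tmpl; *p != '\0'; p++) {
        if (quote == '\0') {
            if (*p == '"' || *p == '\'') {
                quote = *p;         /* entering a string */
            }
        } else if (*p == '\\' && p[1] != '\0') {
            p++;                    /* skip the escaped character */
        } else if (*p == quote) {
            quote = '\0';           /* leaving the string */
        } else if (*p == '%') {
            return true;            /* GCC and the parser would disagree here */
        }
    }
    return false;
}
```

      The template "{ '%s': %d }" is flagged, while "{ 'key': %d }" is
      not, because its only '%' sits outside any string.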
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180823164025.12553-57-armbru@redhat.com>
      16a48599
    • json: Keep interpolation state in JSONParserContext · ada74c3b
      Markus Armbruster authored
      
      The recursive descent parser passes along a pointer to
      JSONParserContext.  It additionally passes a pointer to interpolation
      state (a va_alist *) as needed to reach its consumer
      parse_interpolation().
      
      Stuffing the latter pointer into JSONParserContext saves us the
      trouble of passing it along, so do that.
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180823164025.12553-56-armbru@redhat.com>
      ada74c3b
    • json: Clean up headers · 86cdf9ec
      Markus Armbruster authored
      
      The JSON parser has three public headers, json-lexer.h, json-parser.h,
      json-streamer.h.  They all contain stuff that is of no interest
      outside qobject/json-*.c.
      
      Collect the public interface in include/qapi/qmp/json-parser.h, and
      everything else in qobject/json-parser-int.h.
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180823164025.12553-54-armbru@redhat.com>
      86cdf9ec
    • qobject: Drop superfluous includes of qemu-common.h · 812ce33e
      Markus Armbruster authored
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180823164025.12553-53-armbru@redhat.com>
      812ce33e
    • json: Make JSONToken opaque outside json-parser.c · abe7c206
      Markus Armbruster authored
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180823164025.12553-52-armbru@redhat.com>
      abe7c206
    • json: Unbox tokens queue in JSONMessageParser · a2731e08
      Markus Armbruster authored
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180823164025.12553-51-armbru@redhat.com>
      a2731e08
    • json: Streamline json_message_process_token() · 8d3265b3
      Markus Armbruster authored
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180823164025.12553-50-armbru@redhat.com>
      8d3265b3
    • json: Enforce token count and size limits more tightly · da09cfbf
      Markus Armbruster authored
      
      Token count and size limits exist to guard against excessive heap
      usage.  We check them only after we created the token on the heap.
      That's assigning a cowboy to the barn to lasso the horse after it has
      bolted.  Close the barn door instead: check before we create the
      token.
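
      Checking first could look like the sketch below.  The limits, the
      struct and the function name are all hypothetical and chosen only
      to keep the example small; they are not the values or names used
      by the streamer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical limits, chosen only for illustration. */
#define MAX_TOKEN_COUNT 4
#define MAX_TOKEN_BYTES 64

struct token_budget {
    size_t count;                   /* tokens queued so far */
    size_t bytes;                   /* bytes used by queued tokens */
};

/*
 * Account for a token of @len bytes.  Returns false, without touching
 * the budget, if queuing it would exceed a limit; the caller then
 * reports an error instead of allocating the token.
 */
bool budget_reserve(struct token_budget *b, size_t len)
{
    if (b->count + 1 > MAX_TOKEN_COUNT || len > MAX_TOKEN_BYTES - b->bytes) {
        return false;
    }
    b->count++;
    b->bytes += len;
    return true;
}
```

      Because the check runs before any allocation, an oversized token
      never reaches the heap.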
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180823164025.12553-49-armbru@redhat.com>
      da09cfbf
    • qjson: Have qobject_from_json() & friends reject empty and blank · dd98e848
      Markus Armbruster authored
      
      The last case where qobject_from_json() & friends return null without
      setting an error is empty or blank input.  Callers:
      
      * block.c's parse_json_protocol() reports "Could not parse the JSON
        options".  It's marked as a work-around, because it also covered
        actual bugs, but they got fixed in the previous few commits.
      
      * qobject_input_visitor_new_str() reports "JSON parse error".  Also
        marked as work-around.  The recent fixes have made this unreachable,
        because it currently gets called only for input starting with '{'.
      
      * check-qjson.c's empty_input() and blank_input() demonstrate the
        behavior.
      
      * The other callers are not affected since they only pass input with
        exactly one JSON value or, in the case of negative tests, one error.
      
      Fail with "Expecting a JSON value" instead of returning null, and
      simplify callers.
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180823164025.12553-48-armbru@redhat.com>
      dd98e848
    • json: Assert json_parser_parse() consumes all tokens on success · 5d50113c
      Markus Armbruster authored
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180823164025.12553-47-armbru@redhat.com>
      5d50113c
    • json: Fix streamer not to ignore trailing unterminated structures · f9277915
      Markus Armbruster authored
      
      json_message_process_token() accumulates tokens until it got the
      sequence of tokens that comprise a single JSON value (it counts curly
      braces and square brackets to decide).  It feeds those token sequences
      to json_parser_parse().  If a non-empty sequence of tokens remains at
      the end of the parse, it's silently ignored.  check-qjson.c cases
      unterminated_array(), unterminated_array_comma(), unterminated_dict(),
      unterminated_dict_comma() demonstrate this bug.
      
      Fix as follows.  Introduce a JSON_END_OF_INPUT token.  When the
      streamer receives it, it feeds the accumulated tokens to
      json_parser_parse().
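
      A toy model of the streamer shows where the flush on end of input
      fits in.  Everything here is hypothetical: tokens are single
      characters, and the function only counts what would be handed to
      the parser rather than calling json_parser_parse().

```c
#include <stddef.h>

/*
 * Hypothetical streamer: @tokens is a string with one character per
 * token ('{', '}', '[', ']', or anything else for another token).
 * Counts how many token sequences would be handed to the parser,
 * and whether a trailing unterminated one is flushed at end of input.
 */
struct stream_stats {
    int complete;                   /* sequences closed at nesting depth 0 */
    int flushed_at_end;             /* 1 if end of input flushed leftovers */
};

struct stream_stats stream_tokens(const char *tokens)
{
    struct stream_stats st = { 0, 0 };
    int depth = 0;
    size_t pending = 0;             /* tokens accumulated since last emit */

    for (const char *p = tokens; *p != '\0'; p++) {
        pending++;
        if (*p == '{' || *p == '[') {
            depth++;
        } else if (*p == '}' || *p == ']') {
            depth--;
        }
        if (depth <= 0) {           /* value complete, or brackets unbalanced */
            st.complete++;
            depth = 0;
            pending = 0;
        }
    }
    /* End of input: hand over what is left instead of dropping it. */
    if (pending > 0) {
        st.flushed_at_end = 1;
    }
    return st;
}
```

      For the token string "[1," nothing completes at depth zero, so
      without the final flush the three tokens would be silently lost.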
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180823164025.12553-46-armbru@redhat.com>
      f9277915
    • json: Fix latent parser aborts at end of input · e06d008a
      Markus Armbruster authored
      
      json-parser.c carefully reports end of input like this:
      
          token = parser_context_pop_token(ctxt);
          if (token == NULL) {
              parse_error(ctxt, NULL, "premature EOI");
              goto out;
          }
      
      Except parser_context_pop_token() can't return null, it fails its
      assertion instead.  Same for parser_context_peek_token().  Broken in
      commit 65c0f1e9, and faithfully preserved in commit 95385fe9.
      Only a latent bug, because the streamer throws away any input that
      could trigger it.
      
      Drop the assertions, so we can fix the streamer in the next commit.
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180823164025.12553-45-armbru@redhat.com>
      e06d008a
    • qjson: Fix qobject_from_json() & friends for multiple values · 2a4794ba
      Markus Armbruster authored
      
      qobject_from_json() & friends use the consume_json() callback to
      receive either a value or an error from the parser.
      
      When they are fed a string that contains more than either one JSON
      value or one JSON syntax error, consume_json() gets called multiple
      times.
      
      When the last call receives a value, qobject_from_json() returns that
      value.  Any other values are leaked.
      
      When any call receives an error, qobject_from_json() sets the first
      error received.  Any other errors are thrown away.
      
      When values follow errors, qobject_from_json() returns both a value
      and sets an error.  That's bad.  Impact:
      
      * block.c's parse_json_protocol() ignores and leaks the value.  It's
        used to parse pseudo-filenames starting with "json:".  The
        pseudo-filenames can come from the user or from image meta-data such
        as a QCOW2 image's backing file name.
      
      * vl.c's parse_display_qapi() ignores and leaks the error.  It's used
        to parse the argument of command line option -display.
      
      * vl.c's main() case QEMU_OPTION_blockdev ignores the error and leaves
        it in @err.  main() will then pass a pointer to a non-null Error *
        to net_init_clients(), which is forbidden.  It can lead to assertion
        failure or other misbehavior.
      
      * check-qjson.c's multiple_values() demonstrates the badness.
      
      * The other callers are not affected since they only pass strings with
        exactly one JSON value or, in the case of negative tests, one
        error.
      
      The impact on the _nofail() functions is relatively harmless.  They
      abort when any call receives an error.  Else they return the last
      value, and leak the others, if any.
      
      Fix consume_json() as follows.  On the first call, save value and
      error as before.  On subsequent calls, if any, don't save them.  If
      the first call saved a value, the next call, if any, replaces the
      value by an "Expecting at most one JSON value" error.  Take care not
      to leak values or errors that aren't saved.
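
      The fixed callback logic can be sketched with plain strings
      standing in for QObject * and Error *.  The struct, its fields and
      the function name are hypothetical, and the sketch does not model
      reference counting or freeing.

```c
#include <stddef.h>

/* Hypothetical stand-ins for QObject * and Error *. */
struct result {
    const char *value;              /* saved value, or NULL */
    const char *error;              /* saved error, or NULL */
    int calls;                      /* number of consume() calls so far */
};

/* Exactly one of @value and @error is non-NULL on each call. */
void consume(struct result *r, const char *value, const char *error)
{
    if (r->calls++ == 0) {
        r->value = value;           /* first call: save as before */
        r->error = error;
        return;
    }
    if (r->value != NULL) {         /* a second result invalidates the value */
        r->value = NULL;
        r->error = "Expecting at most one JSON value";
    }
    /* Later values and errors are dropped; real code must free them. */
}
```

      After any sequence of calls, at most one of value and error is
      set, so a caller never sees a value together with an error.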
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180823164025.12553-44-armbru@redhat.com>
      2a4794ba
    • json: Improve names of lexer states related to numbers · 4d400661
      Markus Armbruster authored
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180823164025.12553-43-armbru@redhat.com>
      4d400661
    • json: Replace %I64d, %I64u by %PRId64, %PRIu64 · 53a0d616
      Markus Armbruster authored
      
      Support for %I64d got added in commit 2c0d4b36 "json: fix PRId64 on
      Win32".  We had to hard-code I64d because we used the lexer's finite
      state machine to check interpolations.  No more, so clean this up.
      
      Additional conversion specifications would be easy enough to implement
      when needed.
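
      Once the check is ordinary C code rather than a state machine, the
      platform's spelling can come from <inttypes.h>.  A minimal sketch,
      with a hypothetical function name:

```c
#include <inttypes.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical check: does @spec name a 64-bit integer interpolation? */
bool is_int64_interpolation(const char *spec)
{
    /* PRId64 and PRIu64 expand to whatever the platform's printf expects. */
    return strcmp(spec, "%" PRId64) == 0 || strcmp(spec, "%" PRIu64) == 0;
}
```

      The comparison adapts to the host automatically, so no
      platform-specific literal needs to be hard-coded.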
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180823164025.12553-42-armbru@redhat.com>
      53a0d616
    • json: Leave rejecting invalid interpolation to parser · f7617d45
      Markus Armbruster authored
      
      Both lexer and parser reject invalid interpolation specifications.
      The parser's check is useless.
      
      The lexer ends the token right after the first bad character.  This
      tends to lead to suboptimal error reporting.  For instance, input
      
          [ %04d ]
      
      produces the tokens
      
          JSON_LSQUARE  [
          JSON_ERROR    %0
          JSON_INTEGER  4
          JSON_KEYWORD  d
          JSON_RSQUARE  ]
      
      The parser then yields an error, an object and two more errors:
      
          error: Invalid JSON syntax
          object: 4
          error: JSON parse error, invalid keyword
          error: JSON parse error, expecting value
      
      Dumb down the lexer to accept [A-Za-z0-9]*.  The parser's check is now
      used.  Emit a proper error there.
      
      The lexer now produces
      
          JSON_LSQUARE  [
          JSON_INTERP   %04d
          JSON_RSQUARE  ]
      
      and the parser reports just
      
          JSON parse error, invalid interpolation '%04d'
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180823164025.12553-41-armbru@redhat.com>
      f7617d45
    • json: Pass lexical errors and limit violations to callback · 84a56f38
      Markus Armbruster authored
      
      The callback to consume JSON values takes QObject *json, Error *err.
      If both are null, the callback is supposed to make up an error by
      itself.  This sucks.
      
      qjson.c's consume_json() neglects to do so, which makes
      qobject_from_json() return null instead of failing.  I consider that a bug.
      
      The culprit is json_message_process_token(): it passes two null
      pointers when it runs into a lexical error or a limit violation.  Fix
      it to pass a proper Error object then.  Update the callbacks:
      
      * monitor.c's handle_qmp_command(): the code to make up an error is
        now dead, drop it.
      
      * qga/main.c's process_event(): lumps the "both null" case together
        with the "not a JSON object" case.  The former is now gone.  The
        error message "Invalid JSON syntax" is misleading for the latter.
        Improve it to "Input must be a JSON object".
      
      * qobject/qjson.c's consume_json(): no update; check-qjson
        demonstrates qobject_from_json() now sets an error on lexical
        errors, but still doesn't on some other errors.
      
      * tests/libqtest.c's qmp_response(): the Error object is now reliable,
        so use it to improve the error message.
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180823164025.12553-40-armbru@redhat.com>
      84a56f38
    • json: Treat unwanted interpolation as lexical error · 2cbd15aa
      Markus Armbruster authored
      
      The JSON parser optionally supports interpolation.  The lexer
      recognizes interpolation tokens unconditionally.  The parser rejects
      them when interpolation is disabled, in parse_interpolation().
      However, it neglects to set an error then, which can make
      json_parser_parse() fail without setting an error.
      
      Move the check for unwanted interpolation from the parser's
      parse_interpolation() into the lexer's finite state machine.  When
      interpolation is disabled, '%' is now handled like any other
      unexpected character.
      
      The next commit will improve how such lexical errors are handled.
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180823164025.12553-39-armbru@redhat.com>
      2cbd15aa
    • json: Rename token JSON_ESCAPE & friends to JSON_INTERP · 61030280
      Markus Armbruster authored
      
      The JSON parser optionally supports interpolation.  The code calls it
      "escape".  Awkward, because it uses the same term for escape sequences
      within strings.  The latter usage is consistent with RFC 8259 "The
      JavaScript Object Notation (JSON) Data Interchange Format" and ISO C.
      Call the former "interpolation" instead.
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180823164025.12553-38-armbru@redhat.com>
      61030280
    • json: Don't create JSON_ERROR tokens that won't be used · 269e57ae
      Markus Armbruster authored
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180823164025.12553-37-armbru@redhat.com>
      269e57ae
    • json: Don't pass null @tokens to json_parser_parse() · ff281a27
      Markus Armbruster authored
      
      json_parser_parse() normally returns the QObject on success.  Except
      it returns null when its @tokens argument is null.
      
      Its only caller json_message_process_token() passes null @tokens when
      emitting a lexical error.  The call is a rather opaque way to say json
      = NULL then.
      
      Simplify matters by lifting the assignment to json out of the emit
      path: initialize json to null, set it to the value of
      json_parser_parse() when there's no lexical error.  Drop the special
      case from json_parser_parse().
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180823164025.12553-36-armbru@redhat.com>
      ff281a27
    • json: Redesign the callback to consume JSON values · 62815d85
      Markus Armbruster authored
      
      The classical way to structure parser and lexer is to have the client
      call the parser to get an abstract syntax tree, the parser call the
      lexer to get the next token, and the lexer call some function to get
      input characters.
      
      Another way to structure them would be to have the client feed
      characters to the lexer, the lexer feed tokens to the parser, and the
      parser feed abstract syntax trees to some callback provided by the
      client.  This way is more easily integrated into an event loop that
      dispatches input characters as they arrive.
      
      Our JSON parser is kind of between the two.  The lexer feeds tokens to
      a "streamer" instead of a real parser.  The streamer accumulates
      tokens until it got the sequence of tokens that comprise a single JSON
      value (it counts curly braces and square brackets to decide).  It
      feeds those token sequences to a callback provided by the client.  The
      callback passes each token sequence to the parser, and gets back an
      abstract syntax tree.
      
      I figure it was done that way to make a straightforward recursive
      descent parser possible.  "Get next token" becomes "pop the first
      token off the token sequence".  Drawback: we need to store a complete
      token sequence.  Each token eats 13 + input characters + malloc
      overhead bytes.
      
      Observations:
      
      1. This is not the only way to use recursive descent.  If we replaced
         "get next token" by a coroutine yield, we could do without a
         streamer.
      
      2. The lexer reports errors by passing a JSON_ERROR token to the
         streamer.  This communicates the offending input characters and
         their location, but no more.
      
      3. The streamer reports errors by passing a null token sequence to the
         callback.  The (already poor) lexical error information is thrown
         away.
      
      4. Having the callback receive a token sequence duplicates the code to
         convert token sequence to abstract syntax tree in every callback.
      
      5. Known bug: the streamer silently drops incomplete token sequences.
      
      This commit rectifies 4. by lifting the call of the parser from the
      callbacks into the streamer.  Later commits will address 3. and 5.
      
      The lifting removes a bug from qjson.c's parse_json(): it passed a
      pointer to a non-null Error * in certain cases, as demonstrated by
      check-qjson.c.
      
      json_parser_parse() is now unused.  It's a stupid wrapper around
      json_parser_parse_err().  Drop it, and rename json_parser_parse_err()
      to json_parser_parse().
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180823164025.12553-35-armbru@redhat.com>
      62815d85
    • json: Have lexer call streamer directly · 037f2440
      Markus Armbruster authored
      
      json_lexer_init() takes the function to process a token as an
      argument.  It's always json_message_process_token().  Makes the code
      harder to understand for no actual gain.  Drop the indirection.
      
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Eric Blake <eblake@redhat.com>
      Message-Id: <20180823164025.12553-34-armbru@redhat.com>
      037f2440
    • json-parser: simplify and avoid JSONParserContext allocation · e8b19d7d
      Marc-André Lureau authored
      
      parser_context_new/free() are only used from json_parser_parse(). We
      can fold the code there and avoid an allocation altogether.
      
      Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
      Message-Id: <20180719184111.5129-9-marcandre.lureau@redhat.com>
      Reviewed-by: Markus Armbruster <armbru@redhat.com>
      Message-Id: <20180823164025.12553-33-armbru@redhat.com>
      e8b19d7d
    • json: remove useless return value from lexer/parser · 7c1e1d54
      Marc-André Lureau authored
      
      The lexer always returns 0 when fed characters. Furthermore, none of
      the callers care about the return value.
      
      Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
      Message-Id: <20180326150916.9602-10-marcandre.lureau@redhat.com>
      Reviewed-by: Markus Armbruster <armbru@redhat.com>
      Reviewed-by: Thomas Huth <thuth@redhat.com>
      Signed-off-by: Markus Armbruster <armbru@redhat.com>
      Message-Id: <20180823164025.12553-32-armbru@redhat.com>
      7c1e1d54