1. 30 May, 2000 1 commit
  2. 25 May, 2000 1 commit
  3. 19 Apr, 2000 1 commit
  4. 02 Apr, 2000 1 commit
    • Stefan Monnier authored · 2d1675e4
      * regex.c (PTR_TO_OFFSET) [!emacs]: Remove.
      (RE_MULTIBYTE_P, RE_STRING_CHAR_AND_LENGTH): New macros.
      (GET_CHAR_BEFORE_2): Moved from charset.h plus fixed minor bug when
      we are between str1 and str2.
      (MAX_MULTIBYTE_LENGTH, CHAR_STRING) [!emacs]: Provide trivial default.
      (PATFETCH): Use `TRANSLATE'.
      (PATFETCH_RAW): Fetch multibyte char if applicable.
      (PATUNFETCH): Remove.
      (regex_compile): Rely on PATFETCH to do most of the multibyte magic.
      When writing a char, write it directly into the pattern buffer rather
      than going needlessly through a temp char-array.
      (re_match_2_internal): Similarly, rely on RE_STRING_CHAR to do the
      multibyte magic and remove the useless `#ifdef emacs'.
      (bcmp_translate): Don't compare as multibyte chars when in a unibyte
      buffer.
      * regex.h (struct re_pattern_buffer): Make field `multibyte'
      conditional on `emacs'.
      * charset.h (GET_CHAR_BEFORE_2): Moved to regex.c.
  5. 29 Mar, 2000 1 commit
    • Stefan Monnier authored · f6a3f532
      (analyse_first): New function obtained by ripping out most
      of re_compile_fastmap and generalizing it a little bit so that it
      can also just return whether a given (sub)pattern can match the empty
      string or not.
      (regex_compile): Use `analyse_first' to decide whether the loop-check
      needs to be done or not for *, +, *? and +? (the loop check is costly
      for non-greedy repetition).
      (re_compile_fastmap): Delegate the actual work to `analyse_first'.
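
The question `analyse_first' answers for the loop check, "can this (sub)pattern match the empty string?", can be illustrated on a toy pattern tree. This is only a minimal sketch under stated assumptions: `struct node' and `can_match_empty' are hypothetical names, and the real function works on the compiled pattern bytecode rather than on a tree.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy pattern tree; regex.c analyses compiled bytecode instead.  */
enum kind { CHAR, CONCAT, ALT, STAR, PLUS };

struct node
{
  enum kind kind;
  const struct node *left;   /* operand, or first operand */
  const struct node *right;  /* second operand of CONCAT and ALT */
};

/* Return true if the pattern rooted at N can match the empty string.  */
bool
can_match_empty (const struct node *n)
{
  switch (n->kind)
    {
    case CHAR:                /* always consumes one character */
      return false;
    case STAR:                /* zero iterations are allowed */
      return true;
    case PLUS:                /* at least one iteration of the body */
      return can_match_empty (n->left);
    case CONCAT:              /* both halves must be able to match empty */
      return can_match_empty (n->left) && can_match_empty (n->right);
    case ALT:                 /* either branch is enough */
      return can_match_empty (n->left) || can_match_empty (n->right);
    }
  return false;
}
```

In this model a loop such as (a*)* has a body that can match the empty string and therefore needs the loop check, while (ab)* does not.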
  6. 27 Mar, 2000 1 commit
    • Stefan Monnier authored · ed0767d8
      (REGEX_FREE_STACK, RESET_FAIL_STACK): Make them usable as an expression.
      (enum re_opcode_t): Update description of succeed_n.
      (PATFETCH): Always define.
      (regex_compile): Use lookahead rather than PATUNFETCH (for repetition
      operators, char classes, shy-groups and intervals).
      Optimize special cases of intervals so as to only use succeed_n and
      jump_n when really needed.
      (re_compile_fastmap): Simplify handling of jump_n and succeed_n now
      that we don't have to handle the special cases any more.
      Simplify on_failure_jump handling as well.
  7. 26 Mar, 2000 1 commit
    • Stefan Monnier authored · 0683b6fa
      (enum re_opcode_t): New opcode on_failure_jump_nastyloop.
      (print_partial_compiled_pattern, re_compile_fastmap): Handle new opcode.
      (regex_compile): Use on_failure_jump_nastyloop for non-greedy loops.
      (re_match_2_internal): Add code for on_failure_jump_nastyloop when
      executing it as well as when popping it off the stack to find infinite
      loops in non-greedy repetition operators.
  8. 23 Mar, 2000 1 commit
  9. 22 Mar, 2000 2 commits
    • Dave Love's avatar
    • Stefan Monnier authored · 1fb352e0
      (CHAR_CHARSET, CHARSET_LEADING_CODE_BASE): Add default
      definitions for non-Emacs compilation.
      (enum re_opcode_t): Remove (not)wordchar and move (not)syntaxspec
      outside of `#ifdef emacs'.
      (print_partial_compiled_pattern): Update.
      (regex_compile): Use (not)syntaxspec(Sword) instead of (not)wordchar.
      (re_compile_fastmap): Merge handling of charset and charset_not (for
      emacs and non-emacs compilation as well).
      Similarly for (not)categoryspec and (not)syntaxspec.
      Don't use the fastmap when reaching `anychar' since the added
      complexity is not justified.
      (re_match_2_internal): Merge (not)wordchar (emacs and non-emacs)
      and (not)syntaxspec.  Merge (not)categoryspec.
  10. 19 Mar, 2000 1 commit
    • Stefan Monnier authored · 4e8a9132
      (RE_STRING_CHAR): New macro.
      (GET_CHAR_AFTER_2): Remove.
      (RE_TRANSLATE, RE_TRANSLATE_P): New macros moved from regex.h.
      (enum re_opcode_t): Remove on_failure_jump_exclusive.
      (print_partial_compiled_pattern, re_compile_fastmap)
      (re_match_2_internal): Remove on_failure_jump_exclusive.
      (regex_compile): Turn optimizable P+ loops into PP*, so that the
      optimization only needs to work for * (i.e. it can use
      on_failure_keep_string_jump).
      Remove the special case for .*\n since it is now covered by the general
      optimization.
      (re_search_2): Don't bother with `room'.
      (skip_one_char): New function.
      (skip_noops): Simplify since `memory' is not needed any more.
      (mutually_exclusive_p): Restructure slightly to use `switch' and
      add handling for "all" remaining cases.
      (re_match_2_internal): Change on_failure_jump_smart to use
      on_failure_keep_string_jump (and redirect the end-of-loop jump)
      rather than on_failure_jump_exclusive.
  11. 16 Mar, 2000 1 commit
    • Stefan Monnier authored · 99633e97
      (re_match_2): Fix string shortening (to fit `stop') to make sure
      POINTER_TO_OFFSET gives the same value before and after PREFETCH.
      Use `dfail' to guarantee "atomic" matching.
      (PTR_TO_OFFSET): Use POINTER_TO_OFFSET.
      (debug): Now only active if > 0 rather than if != 0.
      (DEBUG_*): Update for the new meaning of `debug'.
      (print_partial_compiled_pattern): Add missing `succeed' case.
      Use CHARSET_* macros in the charset(_not) branch.
      Fix off-by-two bugs in `succeed_n', `jump_n' and `set_number_at'.
      (store_op1, store_op2, insert_op1, insert_op2)
      (at_begline_loc_p, at_endline_loc_p): Add prototype.
      (group_in_compile_stack): Move to after its arg's types are declared
      and add a prototype.
      (PATFETCH): Define in terms of PATFETCH_RAW.
      (GET_UNSIGNED_NUMBER): Add the usual `do { ... } while(0)' wrapper.
      (QUIT): Redefine as a nop except for NTemacs.
      (regex_compile): Handle an interval {,M} as if it were {0,M}.
      Fix indentation of the greedy-op and shy-group code.
      (at_(beg|end)line_loc_p): Fix argument's types.
      (re_compile_fastmap): Ifdef out failure_stack_ptr to shut up gcc.
      (re_search_2): Use POS_AS_IN_BUFFER.  Simplify `room' computation.
      (MATCHING_IN_FIRST_STRING): Remove.
      (re_match_2): Use POS_AS_IN_BUFFER.
      Ifdef out failure_stack_ptr to shut up gcc.
      Use FIRST_STRING_P and POINTER_TO_OFFSET.
      Use QUIT unconditionally.
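
The `do { ... } while(0)' wrapper mentioned for GET_UNSIGNED_NUMBER is a general C idiom for multi-statement macros. The sketch below shows why it matters; RESET_BAD and RESET_GOOD are hypothetical macros made up for illustration and are not taken from regex.c.

```c
/* Hypothetical two-statement macro, written without a wrapper.  */
#define RESET_BAD(x, y)   (x) = 0; (y) = 0

/* The same macro wrapped so that it expands to exactly one statement.  */
#define RESET_GOOD(x, y)  do { (x) = 0; (y) = 0; } while (0)

/* Expands to: if (cond) (*x) = 0; (*y) = 0;
   so *y is reset even when COND is false.  */
void
reset_unwrapped (int cond, int *x, int *y)
{
  if (cond)
    RESET_BAD (*x, *y);
}

/* With the wrapper the whole body is guarded by COND, and the macro call
   followed by `;' can still be followed by an `else'.  */
void
reset_wrapped (int cond, int *x, int *y)
{
  if (cond)
    RESET_GOOD (*x, *y);
  else
    *x = 2;
}
```

Calling reset_unwrapped with a false condition still clears *y, which is the kind of silent bug the wrapper prevents.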
  12. 14 Mar, 2000 1 commit
    • Stefan Monnier authored · 66f0296e
      * regex.c: Declare a new type `re_char' used throughout the code for the
      string char type.  It's `const unsigned char' to match the rest of Emacs.
      Consistently make sure all pointers to strings use it and make sure all
      pointers into the pattern use `unsigned char'.
      (re_match_2_internal): Use `PREFETCH+STRING_CHAR' instead of
      GET_CHAR_AFTER_2.
      Also merge wordbound and notwordbound to reduce code duplication.
      * charset.h (GET_CHAR_AFTER_2): Remove.
      (GET_CHAR_BEFORE_2): Use unsigned chars, like everywhere else.
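
One practical reason for reading string bytes through an unsigned type is table indexing: where plain `char' is signed, a byte of 0x80 or above becomes negative. The sketch below illustrates this; the `re_char' typedef follows the commit message, while `toy_fastmap' and the function names are hypothetical and not taken from regex.c.

```c
typedef const unsigned char re_char;   /* as described in the commit message */

/* Hypothetical 256-entry table, one flag per possible byte value.  */
static char toy_fastmap[256];

/* Mark BYTE as a possible first byte of a match.  */
void
toy_fastmap_set (unsigned char byte)
{
  toy_fastmap[byte] = 1;
}

/* Return nonzero if the first byte of STR is marked.  Because re_char is
   unsigned, bytes >= 0x80 index the upper half of the table instead of
   turning into negative offsets.  */
int
toy_fastmap_p (re_char *str)
{
  return toy_fastmap[*str];
}

/* The same byte read through plain `char' may be negative on platforms
   where `char' is signed, so it is not safe to use as an index.  */
int
as_plain_char (const char *str)
{
  return *str;
}
```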
  13. 08 Mar, 2000 1 commit
    • Stefan Monnier authored · 505bde11
      This is a big redesign of failure-stack and register handling, prompted
      by bugs revealed when trying to add shy-groups.  Overall, what happened
      is that loops are now structured a little differently, groups can be
      shy and the code is a little simpler.
      
      (enum re_opcode_t): Remove jump_past_alt, maybe_pop_jump,
      push_dummy_failure and dummy_failure_jump.
      Add on_failure_jump_(exclusive, loop and smart).
      Also fix the comment for (start|stop)_memory since they now only take
      one argument (the second has become unnecessary).
      (print_partial_compiled_pattern): Adjust for changes in re_opcode_t.
      (print_compiled_pattern): Use %ld to printf long ints and flush to make
      debugging a little easier.
      (union fail_stack_elt): Make the integer unsigned.
      (struct fail_stack_type): Add a `frame' element.
      (INIT_FAIL_STACK): Init `frame' as well.
      (POP_PATTERN_OP): New macro for re_compile_fastmap.
      (DEBUG_PUSH, DEBUG_POP): Remove.
      (NUM_REG_ITEMS): Remove.
      (NUM_NONREG_ITEMS): Adjust.
      (FAILURE_PAT, FAILURE_STR, NEXT_FAILURE_HANDLE, TOP_FAILURE_HANDLE):
      New macros for the cycle detection.
      (ENSURE_FAIL_STACK): New macro for PUSH_FAILURE_(REG|POINT).
      (PUSH_FAILURE_REG, POP_FAILURE_REG, CHECK_INFINITE_LOOP): New macros.
      (PUSH_FAILURE_POINT): Don't push registers any more.
      The pattern address pushed is not the destination of the jump
      but the source of it instead.
      (NUM_FAILURE_ITEMS): Remove.
      (POP_FAILURE_POINT): Adapt to the new stack structure (i.e. pop
      registers before the actual failure point).
      Don't hardcode any meaning for str==NULL anymore.
      (union register_info_type, REG_MATCH_NULL_STRING_P, IS_ACTIVE)
      (MATCHED_SOMETHING, EVER_MATCHED_SOMETHING, SET_REGS_MATCHED): Remove.
      (REG_UNSET_VALUE): Use NULL (why not?).
      (compile_range): Remove declaration since it doesn't exist.
      (struct compile_stack_elt_t): Remove inner_group_offset.
      (old_reg(start|end), reg_info, reg_dummy, reg_info_dummy): Remove.
      (regex_grow_registers): Remove dead code.
      (FIXUP_ALT_JUMP): New macro.
      (regex_compile): Add shy-groups.
      Change loops to use on_failure_jump_smart&jump instead of
      on_failure_jump&maybe_pop_jump.
      Change + loops to eliminate the initial (dummy_failure_)jump.
      Remove c1_base (looks like unused variable to me).
      Use `jump' instead of `jump_past_alt' and don't bother with
      push_dummy_failure in alternatives since it is now unnecessary.
      Use FIXUP_ALT_JUMP.
      Eliminate a useless `#ifdef emacs' for (re)allocating the stack.
      (re_compile_fastmap): Remove dead variables i and num_regs.
      Exit from loop when bufp->can_be_null rather than jumping to `done'.
      Avoid jumping backwards so as to ensure termination.
      Use PATTERN_STACK_EMPTY and POP_PATTERN_OP.
      Improved handling of backreferences.
      Remove dead code in handling of `anychar'.
      (skip_noops, mutually_exclusive_p): New functions taken from the
      handling of `maybe_pop_jump' in re_match_2_internal.
      Slightly improve mutually_exclusive_p to handle ".+\n".
      ((lowest|highest)_active_reg, NO_(LOWEST|HIGHEST)_ACTIVE_REG):
      Remove.
      (re_match_2_internal): Use %p instead of 0x%x when printf'ing ptrs.
      Don't SET_REGS_MATCHED anymore.  Remove many dead variables.
      Push register (in `start_memory') on the stack rather than storing it
      in old_reg(start|end).
      Remove the cycle detection from `stop_memory', replaced by the use
      of on_failure_jump_loop for greedy loops.
      Add code for the new on_failure_jump_<foo>.
      Remove ad-hoc code in `on_failure_jump' to push more registers
      in the case of a loop.
      Take out code from `maybe_pop_jump' into separate functions and
      adapt it to the semantics of `on_failure_jump_smart'.
      Remove jump_past_alt, dummy_failure_jump and push_dummy_failure.
      Remove dummy_failure handling and handling of `failures to jump
      to on_failure_jump' (this last one was already dead code, it seems).
      ((group|alt|common_op)_match_null_string_p): Remove.
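
The new stack discipline described in this entry (register values pushed when a group starts, and popped again before the failure point itself) can be sketched with a toy stack. This is an illustration under stated assumptions, not the regex.c implementation: every `toy_' name and the entry layout are hypothetical.

```c
#include <stddef.h>

#define TOY_STACK_SIZE 64
#define TOY_NUM_REGS 4

/* Hypothetical stack entry: either a saved register or a failure point.  */
struct toy_entry
{
  int is_reg;          /* nonzero: saved register; zero: failure point */
  int reg;             /* register number, for saved-register entries */
  const char *value;   /* old register value, or position to resume at */
};

struct toy_stack
{
  struct toy_entry item[TOY_STACK_SIZE];
  size_t avail;        /* number of entries in use */
};

/* Save the current value of register REG, then set it to NEW_VALUE.
   Return 0 if the stack is full.  */
int
toy_push_reg (struct toy_stack *s, const char **regs, int reg,
              const char *new_value)
{
  if (s->avail == TOY_STACK_SIZE)
    return 0;
  s->item[s->avail++] = (struct toy_entry) { 1, reg, regs[reg] };
  regs[reg] = new_value;
  return 1;
}

/* Record a string position to resume from if matching fails later.  */
int
toy_push_failure_point (struct toy_stack *s, const char *str)
{
  if (s->avail == TOY_STACK_SIZE)
    return 0;
  s->item[s->avail++] = (struct toy_entry) { 0, 0, str };
  return 1;
}

/* Undo the register changes made since the last failure point, then pop
   that point and return its position (NULL if no failure point is left).  */
const char *
toy_pop_failure_point (struct toy_stack *s, const char **regs)
{
  while (s->avail > 0)
    {
      struct toy_entry e = s->item[--s->avail];
      if (!e.is_reg)
        return e.value;
      regs[e.reg] = e.value;
    }
  return NULL;
}
```

Keeping saved registers on the same stack means that backtracking restores them as a side effect of popping, so no separate arrays of old register values are needed in this model.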
  14. 18 Jan, 2000 1 commit
  15. 15 Dec, 1999 2 commits
    • Dave Love authored · 362ba9c6
      Copyright up-date.
    • Dave Love authored · 1c8c6d39
      1999-12-15 Kenichi Handa <handa@etl.go.jp>
      	* regex.c (regex_compile): Adjusted for the change of CHAR_STRING.
      
      1999-12-04  Stefan Monnier  <monnier@cs.yale.edu>
      
      	* regex.c (regex_compile): Recognize *?, +? and ?? as non-greedy
      	operators and handle them properly.
      	* regex.h (RE_ALL_GREEDY): New option.
      	(RE_UNMATCHED_RIGHT_PAREN_ORD): Moved to the end where alphabetic
      	sorting would put it.
      	(RE_SYNTAX_AWK, RE_SYNTAX_GREP, RE_SYNTAX_EGREP)
      	(_RE_SYNTAX_POSIX_COMMON): Use the new option to keep old behavior.
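
The difference between a greedy and a non-greedy operator can be shown on the patterns <.*> and <.*?>. The sketch below hard-codes these two patterns with strrchr and strchr instead of calling a regex engine; the function names are hypothetical, and it assumes the subject string contains no newline (which `.' would not match).

```c
#include <string.h>

/* Length of the match of "<.*>" at the start of S, or 0 if none.
   Greedy: `.*' takes as much as it can, so the match ends at the
   LAST '>' in the string.  */
size_t
match_greedy (const char *s)
{
  const char *end;

  if (*s != '<')
    return 0;
  end = strrchr (s + 1, '>');
  return end ? (size_t) (end - s) + 1 : 0;
}

/* Length of the match of "<.*?>" at the start of S, or 0 if none.
   Non-greedy: `.*?' takes as little as it can, so the match ends at
   the FIRST '>' after the opening '<'.  */
size_t
match_non_greedy (const char *s)
{
  const char *end;

  if (*s != '<')
    return 0;
  end = strchr (s + 1, '>');
  return end ? (size_t) (end - s) + 1 : 0;
}
```

On the input "<b>bold</b>" the greedy version covers the whole string, while the non-greedy version stops after "<b>".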
  16. 28 Oct, 1999 1 commit
  17. 25 Oct, 1999 1 commit
  18. 06 Oct, 1999 1 commit
    • Dave Love authored · f71b19b6
      1999-09-04 Richard M. Stallman <rms@gnu.org>
              * regex.c [emacs] (ISALNUM, ISALPHA, ISPUNCT): Don't depend on locale.
              [emacs] (ISASCII): Don't define ISASCII in this case.
              (IS_REAL_ASCII): New macro, 2 alternate definitions.
              (ISUNIBYTE): Likewise.
              [emacs] (ISDIGIT, ISCNTRL, ISXDIGIT, ISGRAPH, ISPRINT):
              Don't use ISASCII.
      
              * regex.c: Handle new class names `ascii', `nonascii',
              `unibyte', `multibyte'.
              (BIT_ASCII, BIT_NONASCII, BIT_UNIBYTE, BIT_MULTIBYTE): New macros.
              (IS_CHAR_CLASS): Accept new class names.
              (regex_compile, re_match_2_internal): Handle the new classes.
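
A class test that does not depend on the locale can compare character codes against fixed ASCII ranges instead of calling the <ctype.h> functions. The sketch below shows the idea only: the `toy_' functions are hypothetical, they are not the macro definitions from regex.c, and they deliberately treat every non-ASCII code as not alphanumeric.

```c
/* Nonzero if C is a 7-bit ASCII character code.  */
int
toy_is_ascii (int c)
{
  return c >= 0 && c < 0x80;
}

/* Nonzero if C lies outside the 7-bit ASCII range.  */
int
toy_is_nonascii (int c)
{
  return c >= 0x80;
}

/* Nonzero if C is an ASCII letter or digit.  Fixed ranges are used, so
   the answer is the same whatever the current locale is.  */
int
toy_is_alnum (int c)
{
  return ((c >= '0' && c <= '9')
          || (c >= 'a' && c <= 'z')
          || (c >= 'A' && c <= 'Z'));
}
```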
  19. 29 Aug, 1999 1 commit
    • Richard M. Stallman authored · 96cc36cc
      [emacs]: Handle character classes for multibyte chars:
      (ISBLANK, ISGRAPH, ISPRINT, ISALNUM, ISALPHA, ISLOWER)
      (ISPUNCT, ISSPACE, ISUPPER): New definitions for emacs only.
      (ISWORD): New macro.
      (re_opcode_t): Add 2 bytes of flag bits to charset and charset_not.
      (CHARSET_RANGE_TABLE): Update definition.
      (CHARSET_RANGE_TABLE_BITS): New macro.
      (print_partial_compiled_pattern): Skip charset's range table.
      (struct range_table_work_area): New field `bits'.
      (SET_RANGE_TABLE_WORK_AREA_BIT): New macro.
      (BIT_ALNUM, BIT_ALPHA, BIT_WORD, BIT_GRAPH, BIT_LOWER, BIT_PRINT)
      (BIT_PUNCT, BIT_SPACE, BIT_UPPER): New macros.
      (CLEAR_RANGE_TABLE_WORK_USED): Clear field `bits'.
      (RANGE_TABLE_WORK_BITS): New macro.
      (IS_CHAR_CLASS): Check for "word".
      (regex_compile): Set the `bits' field for some character classes.
      Handle the `word' class.  Store the `bits' field into the range table.
      (re_compile_fastmap): Handle flag bits in range table.
      (re_match_2_internal): For charset and charset_not,
      handle flag bits in the range table.
  20. 19 Jan, 1999 1 commit
  21. 30 Dec, 1998 1 commit
  22. 10 Dec, 1998 1 commit
  23. 10 Nov, 1998 1 commit
  24. 25 Jul, 1998 1 commit
  25. 09 Jun, 1998 1 commit
  26. 06 Jun, 1998 1 commit
  27. 25 May, 1998 1 commit
  28. 06 May, 1998 1 commit
  29. 29 Apr, 1998 1 commit
  30. 25 Apr, 1998 1 commit
  31. 15 Apr, 1998 2 commits
  32. 12 Apr, 1998 1 commit
  33. 07 Apr, 1998 2 commits
  34. 04 Apr, 1998 1 commit
  35. 03 Apr, 1998 1 commit
    • Richard M. Stallman authored · e934739e
      (compile_range): Unused function deleted.
      (regex_compile): Special handling for range \177-\377.
      (regex_compile): Cast args to TRANSLATE to unsigned char.
      (re_search_2): Fix forward scan handling multibyte.
      Recognize that nonascii characters are not in the fastmap.
      Handle fetching multibyte characters for backward scan.
      (re_match_2_internal): Handle multibyte and translation
      in exactn and anychar.
      (bcmp_translate): Handle multibyte chars for translation.
      
      (TRANSLATE): Don't cast to unsigned char.
      
      (PATFETCH): Use RE_TRANSLATE to translate.
  36. 16 Jan, 1998 1 commit