- 30 May, 2000 1 commit
-
-
Stefan Monnier authored
(re_match_2_internal): Use it and adjust the end_match_2 logic.
-
- 25 May, 2000 1 commit
-
-
Stefan Monnier authored
of an anchor at the beginning of a shy-group.
-
- 19 Apr, 2000 1 commit
-
-
Stefan Monnier authored
instead define end_match(1|2) more carefully. Use GET_CHAR_BEFORE_2 for `begline'.
-
- 02 Apr, 2000 1 commit
-
-
Stefan Monnier authored
(RE_MULTIBYTE_P, RE_STRING_CHAR_AND_LENGTH): New macros. (GET_CHAR_BEFORE_2): Moved from charset.h plus fixed minor bug when we are between str1 and str2. (MAX_MULTIBYTE_LENGTH, CHAR_STRING) [!emacs]: Provide trivial default. (PATFETCH): Use `TRANSLATE'. (PATFETCH_RAW): Fetch multibyte char if applicable. (PATUNFETCH): Remove. (regex_compile): Rely on PATFETCH to do most of the multibyte magic. When writing a char, write it directly into the pattern buffer rather than going needlessly through a temp char-array. (re_match_2_internal): Similarly, rely on RE_STRING_CHAR to do the multibyte magic and remove the useless `#ifdef emacs'. (bcmp_translate): Don't compare as multibyte chars when in a unibyte buffer. * regex.h (struct re_pattern_buffer): Make field `multibyte' conditional on `emacs'. * charset.h (GET_CHAR_BEFORE_2): Moved to regex.c.
-
- 29 Mar, 2000 1 commit
-
-
Stefan Monnier authored
of re_compile_fastmap and generalizing it a little bit so that it can also just return whether a given (sub)pattern can match the empty string or not. (regex_compile): Use `analyse_first' to decide whether the loop-check needs to be done or not for *, +, *? and +? (the loop check is costly for non-greedy repetition). (re_compile_fastmap): Delegate the actual work to `analyse_first'.
-
- 27 Mar, 2000 1 commit
-
-
Stefan Monnier authored
(enum re_opcode_t): Update description of succeed_n. (PATFETCH): Always define. (regex_compile): Use lookahead rather than PATUNFETCH (for repetition operators, char classes, shy-groups and intervals). Optimize special cases of intervals so as to only use succeed_n and jump_n when really needed. (re_compile_fastmap): Simplify handling of jump_n and succeed_n now that we don't have to handle the special cases any more. Simplify on_failure_jump handling as well.
-
- 26 Mar, 2000 1 commit
-
-
Stefan Monnier authored
(print_partial_compiled_pattern, re_compile_fastmap): Handle new opcode. (regex_compile): Use on_failure_jump_nastyloop for non-greedy loops. (re_match_2_internal): Add code for on_failure_jump_nastyloop when executing it as well as when popping it off the stack to find infinite loops in non-greedy repetition operators.
-
- 23 Mar, 2000 1 commit
-
-
Stefan Monnier authored
(re_compile_fastmap, re_match_2_internal): Undo Dave's previous fix.
-
- 22 Mar, 2000 2 commits
-
-
Dave Love authored
-
Stefan Monnier authored
definitions for non-Emacs compilation. (enum re_opcode_t): Remove (not)wordchar and move (not)syntaxspec outside of `#ifdef emacs'. (print_partial_compiled_pattern): Update. (regex_compile): Use (not)syntaxspec(Sword) instead of (not)wordchar. (re_compile_fastmap): Merge handling of charset and charset_not (for emacs and non-emacs compilation as well). Similarly for (not)categoryspec and (not)syntaxspec. Don't use the fastmap when reaching `anychar' since the added complexity is not justified. (re_match_2_internal): Merge (not)wordchar (emacs and non-emacs) and (not)syntaxspec. Merge (not)categoryspec.
-
- 19 Mar, 2000 1 commit
-
-
Stefan Monnier authored
(GET_CHAR_AFER_2): Remove. (RE_TRANSLATE, RE_TRANSLATE_P): New macros moved from regex.h. (enum re_opcode_t): Remove on_failure_jump_exclusive. (print_partial_compiled_pattern, re_compile_fastmap) (re_match_2_internal): Remove on_failure_jump_exclusive. (regex_compile): Turn optimizable P+ loops into PP*, so that the optimization only need to work for * (ie. can use of_keep_string_jump). Remove the special case for .*\n since it is now covered by the general optimization. (re_search_2): Don't bother with `room'. (skip_one_char): New function. (skip_noops): Simplify since `memory' is not needed any more. (mutually_exclusive_p): Restructure slightly to use `switch' and add handling for "all" remaining cases. (re_match_2_internal): Change on_failure_jump_smart to use on_failure_keep_string_jump (and redirect the end-of-loop jump) rather than on_failure_jump_exclusive.
-
- 16 Mar, 2000 1 commit
-
-
Stefan Monnier authored
POINTER_TO_OFFSET gives the same value before and after PREFETCH. Use `dfail' to guarantee "atomic" matching. (PTR_TO_OFFSET): Use POINTER_TO_OFFSET. (debug): Now only active if > 0 rather than if != 0. (DEBUG_*): Update for the new meaning of `debug'. (print_partial_compiled_pattern): Add missing `succeed' case. Use CHARSET_* macros in the charset(_not) branch. Fix off-by-two bugs in `succeed_n', `jump_n' and `set_number_at'. (store_op1, store_op2, insert_op1, insert_op2) (at_begline_loc_p, at_endline_loc_p): Add prototype. (group_in_compile_stack): Move to after its arg's types are declared and add a prototype. (PATFETCH): Define in terms of PATFETCH_RAW. (GET_UNSIGNED_NUMBER): Add the usual `do { ... } while(0)' wrapper. (QUIT): Redefine as a nop except for NTemacs. (regex_compile): Handle intervals {,M} as if it was {0,M}. Fix indentation of the greedy-op and shy-group code. (at_(beg|end)line_loc_p): Fix argument's types. (re_compile_fastmap): Ifdef out failure_stack_ptr to shut up gcc. (re_search_2): Use POS_AS_IN_BUFFER. Simplify `room' computation. (MATCHING_IN_FIRST_STRING): Remove. (re_match_2): Use POS_AS_IN_BUFFER. Ifdef out failure_stack_ptr to shut up gcc. Use FIRST_STRING_P and POINTER_TO_OFFSET. Use QUIT unconditionally.
-
- 14 Mar, 2000 1 commit
-
-
Stefan Monnier authored
string char type. It's `const unsigned char' to match the rest of Emacs. Consistently make sure all pointers to strings use it and make sure all pointers into the pattern use `unsigned char'. (re_match_2_internal): Use `PREFETCH+STRING_CHAR' instead of GET_CHAR_AFTER_2. Also merge wordbound and notwordbound to reduce code duplication. * charset.h (GET_CHAR_AFTER_2): Remove. (GET_CHAR_BEFORE_2): Use unsigned chars, like everywhere else.
-
- 08 Mar, 2000 1 commit
-
-
Stefan Monnier authored
by bugs revealed when trying to add shy-groups. Overall, what happened is that loops are now structured a little differently, groups can be shy and the code is a little simpler. (enum re_opcode_t): Remove jump_past_alt, maybe_pop_jump, push_dummy_failure and dumy_failure_jump. Add on_failure_jump_(exclusive, loop and smart). Also fix the comment for (start|stop)_memory since they now only take one argument (the second has becomes unnecessary). (print_partial_compiled_pattern): Adjust for changes in re_opcode_t. (print_compiled_pattern): Use %ld to printf long ints and flush to make debugging a little easier. (union fail_stack_elt): Make the integer unsigned. (struct fail_stack_type): Add a `frame' element. (INIT_FAIL_STACK): Init `frame' as well. (POP_PATTERN_OP): New macro for re_compile_fastmap. (DEBUG_PUSH, DEBUG_POP): Remove. (NUM_REG_ITEMS): Remove. (NUM_NONREG_ITEMS): Adjust. (FAILURE_PAT, FAILURE_STR, NEXT_FAILURE_HANDLE, TOP_FAILURE_HANDLE): New macros for the cycle detection. (ENSURE_FAIL_STACK): New macro for PUSH_FAILURE_(REG|POINT). (PUSH_FAILURE_REG, POP_FAILURE_REG, CHECK_INFINITE_LOOP): New macros. (PUSH_FAILURE_POINT): Don't push registers any more. The pattern address pushed is not the destination of the jump but the source of it instead. (NUM_FAILURE_ITEMS): Remove. (POP_FAILURE_POINT): Adapt to the new stack structure (i.e. pop registers before the actual failure point). Don't hardcode any meaning for str==NULL anymore. (union register_info_type, REG_MATCH_NULL_STRING_P, IS_ACTIVE) (MATCHED_SOMETHING, EVER_MATCHED_SOMETHING, SET_REGS_MATCHED): Remove. (REG_UNSET_VALUE): Use NULL (why not?). (compile_range): Remove declaration since it doesn't exist. (struct compile_stack_elt_t): Remove inner_group_offset. (old_reg(start|end), reg_info, reg_dummy, reg_info_dummy): Remove. (regex_grow_registers): Remove dead code. (FIXUP_ALT_JUMP): New macro. (regex_compile): Add shy-groups Change loops to use on_failure_jump_smart&jump instead of on_failure_jump&maybe_pop_jump. Change + loops to eliminate the initial (dummy_failure_)jump. Remove c1_base (looks like unused variable to me). Use `jump' instead of `jump_past_alt' and don't bother with push_dummy_failure in alternatives since it is now unnecessary. Use FIXUP_ALT_JUMP. Eliminate a useless `#ifdef emacs' for (re)allocating the stack. (re_compile_fastmap): Remove dead variables i and num_regs. Exit from loop when bufp->can_be_null rather than jumping to `done'. Avoid jumping backwards so as to ensure termination. Use PATTERN_STACK_EMPTY and POP_PATTERN_OP. Improved handling of backreferences. Remove dead code in handling of `anychar'. (skip_noops, mutually_exclusive_p): New functions taken from the handling of `maybe_pop_jump' in re_match_2_internal. Slightly improve mutually_exclusive_p to handle ".+\n". ((lowest|highest)_active_reg, NO_(LOWEST|HIGHEST)_ACTIVE_REG) Remove. (re_match_2_internal): Use %p instead of 0x%x when printf'ing ptrs. Don't SET_REGS_MATCHED anymore. Remove many dead variables. Push register (in `start_memory') on the stack rather than storing it in old_reg(start|end). Remove the cycle detection from `stop_memory', replaced by the use of on_failure_jump_loop for greedy loops. Add code for the new on_failure_jump_<foo>. Remove ad-hoc code in `on_failure_jump' to push more registers in the case of a loop. Take out code from `maybe_pop_jump' into separate functions and adapt it to the semantics of `on_failure_jump_smart'. Remove jump_past_alt, dummy_failure_jump and push_dummy_failure. Remove dummy_failure handling and handling of `failures to jump to on_failure_jump' (this last one was already dead code, it seems). ((group|alt|common_op)_match_null_string_p): Remove.
-
- 18 Jan, 2000 1 commit
-
-
Richard M. Stallman authored
`charset', skip flag bits for a character class correctly.
-
- 15 Dec, 1999 2 commits
-
-
Dave Love authored
-
Dave Love authored
* regex.c (regex_compile): Adjusted for the change of CHAR_STRING. 1999-12-04 Stefan Monnier <monnier@cs.yale.edu> * regex.c (regex_compile): Recognize *?, +? and ?? as non-greedy operators and handle them properly. * regex.h (RE_ALL_GREEDY): New option. (RE_UNMATCHED_RIGHT_PAREN_ORD): Moved to the end where alphabetic sorting would put it. (RE_SYNTAX_AWK, RE_SYNTAX_GREP, RE_SYNTAX_EGREP) (_RE_SYNTAX_POSIX_COMMON): Use the new option to keep old behavior.
-
- 28 Oct, 1999 1 commit
-
-
Gerd Moellmann authored
as arg to DEBUG_POP and DEBUG_PRINT.
-
- 25 Oct, 1999 1 commit
-
-
Gerd Moellmann authored
-
- 06 Oct, 1999 1 commit
-
-
Dave Love authored
* regex.c [emacs] (ISALNUM, ISALPHA, ISPUNCT): Don't depend on locale [emacs] (ISASCII): Don't define ISASCII in this case. (IS_REAL_ASCII): New macro, 2 alternate definitions. (ISUNIBYTE): Likewise. [emacs] (ISDIGIT, ISCNTRL, ISXDIGIT, ISGRAPH, ISPRINT): Don't use ISASCII. * regex.c: Handle new class names `ascii', `nonascii', `unibyte, `multibyte'. (BIT_ASCII, BIT_NONASCII, BIT_UNIBYTE, BIT_MULTIBYTE): New macros. (IS_CHAR_CLASS): Accept new class names. (regex_compile, re_match_2_internal): Handle the new classes.
-
- 29 Aug, 1999 1 commit
-
-
Richard M. Stallman authored
(ISBLANK, ISGRAPH, ISPRINT, ISALNUM, ISALPHA, ISLOWER) (ISPUNCT, ISSPACE, ISUPPER): New definitions for emacs only. (ISWORD): New macro. (re_opcode_t): Add 2 bytes of flag bits to charset and charset_not. (CHARSET_RANGE_TABLE): Update definition. (CHARSET_RANGE_TABLE_BITS): New macro. (print_partial_compiled_pattern): Skip charset's range table. (struct range_table_work_area): New field `bits'. (SET_RANGE_TABLE_WORK_AREA_BIT): New macro. (BIT_ALNUM, BIT_ALPHA, BIT_WORD, BIT_GRAPH, BIT_LOWER, BIT_PRINT) (BIT_PUNCT, BIT_SPACE, BIT_UPPER): New macros. (CLEAR_RANGE_TABLE_WORK_USED): Clear field `bits'. (RANGE_TABLE_WORK_BITS): New macro. (IS_CHAR_CLASS): Check for "word". (regex_compile): Set the `bits' field for some character classes. Handle the `word' class. Store the `bits' field into the range table. (re_compile_fastmap): Handle flag bits in range table. (re_match_2_internal): For charset and charset_not, handle flag bits in the range table.
-
- 19 Jan, 1999 1 commit
-
-
Richard M. Stallman authored
-
- 30 Dec, 1998 1 commit
-
-
Richard M. Stallman authored
previous change, for charset_not, wordchar, notwordchar, categoryspec, notcategoryspec.
-
- 10 Dec, 1998 1 commit
-
-
Karl Heuer authored
elements for all possible unibyte chars (except newline).
-
- 10 Nov, 1998 1 commit
-
-
Karl Heuer authored
exact-match characters.
-
- 25 Jul, 1998 1 commit
-
-
Richard M. Stallman authored
-
- 09 Jun, 1998 1 commit
-
-
Richard M. Stallman authored
-
- 06 Jun, 1998 1 commit
-
-
Richard M. Stallman authored
(re_match_2, re_search_2): Adjust startpos or pos by 1 only if acting on a buffer. nil for re_match_object means a buffer. (re_match_2_internal <notwordbeg>): Assume POS1 is positive.
-
- 25 May, 1998 1 commit
-
-
Richard M. Stallman authored
(re_match_2_internal): Likewise.
-
- 06 May, 1998 1 commit
-
-
Richard M. Stallman authored
for a repetition operator, don't look beyond end of pattern arg.
-
- 29 Apr, 1998 1 commit
-
-
Andreas Schwab authored
-
- 25 Apr, 1998 1 commit
-
-
Richard M. Stallman authored
Fix the way RANGE is set when handling begbuf.
-
- 15 Apr, 1998 2 commits
-
-
Andreas Schwab authored
needed.
-
Andreas Schwab authored
-
- 12 Apr, 1998 1 commit
-
-
Karl Heuer authored
before calling SETUP_SYNTAX_TABLE_FOR_OBJECT.
-
- 07 Apr, 1998 2 commits
-
-
Karl Heuer authored
-
Richard M. Stallman authored
-
- 04 Apr, 1998 1 commit
-
-
Richard M. Stallman authored
-
- 03 Apr, 1998 1 commit
-
-
Richard M. Stallman authored
(regex_compile): Special handling for range \177-\377. (regex_compile): Cast args to TRANSLATE to unsigned char. (re_search_2): Fix forward scan handling multibyte. Recognize that nonascii characters are not in the fastmap. Handle fetching multibyte characters for backward scan, (re_match_2_internal): Handle multibyte and translation in exactn and anychar. (bcmp_translate): Handle multibyte chars for translation. (TRANSLATE): Don't cast to unsigned char. (PATFETCH): Use RE_TRANSLATE to translate.
-
- 16 Jan, 1998 1 commit
-
-
Richard M. Stallman authored
(re_match_2_internal) <wordbeg, wordend>: Call UPDATE_SYNTAX_TABLE properly with a charpos.
-