You've got a really great bounty, so I'm going to try to nail it comprehensively.
You make assumptions in your regular expression handling that are incorrect.
- Oracle is NOT compatible with Perl regular expressions, it iscompatible with POSIX. It describes its support for Perl as"Perl-Influenced"
- There is an intrinsic syntax conflict around the use of the Perl "*?" in Oracle, if youread that reference the way I do, and Oracle legitimately chooses the POSIX usage
- Your description of how perl handles "*?" is not quite right.
Here is a mashup of the options we've discussed. The key to this issue is around case 30
CASE SRC TEXT RE FROM_WHOM RESULT ------- ------------------------------- ------------------ ----------------- -------------------------------------------------- -------------- 1 Egor's original source string A=1,B=2,C=3, .*B=.*?, Egor's original pattern "doesn't work" A=1,B=2,C=3, 2 Egor's original source string A=1,B=2,C=3, .*B=.?, Egor's "works correctly" A=1,B=2, 3 Egor's original source string A=1,B=2,C=3, .*B=.+?, Old Pro comment 1 form 2 A=1,B=2, 4 Egor's original source string A=1,B=2,C=3, .+B=.*?, Old Pro comment 1 form 1 A=1,B=2, 5 Egor's original source string A=1,B=2,C=3, .*B=.{0,}?, Old Pro comment 2 A=1,B=2, 6 Egor's original source string A=1,B=2,C=3, [^B]*B=[^Bx]*?, Old Pro answer form 1 "good" A=1,B=2, 7 Egor's original source string A=1,B=2,C=3, [^B]*B=[^B]*?, Old Pro answer form 2 "bad" A=1,B=2,C=3, 8 Egor's original source string A=1,B=2,C=3, (.)*B=(.)*?, TBone answer form 1 A=1,B=2, 9 TBone answer example 2 1_@_2_a_3_@_4_a (\w)*?@(\w)* TBone answer example 2 form 1 1_@_2_a_3_ 10 TBone answer example 2 1_@_2_a_3_@_4_a (\w)*@(\w)*? TBone answer example 2 form 2 1_@ 30 Egor's original source string A=1,B=2,C=3, .*B=(.)*?, Schemaczar Variant to force Perl operation A=1,B=2, 31 Egor's original source string A=1,B=2,C=3, .*B=(.*)?, Schemaczar Variant of Egor to force POSIX A=1,B=2,C=3, 32 Egor's original source string A=1,B=2,C=3, .*B=.*{0,1} Schemaczar Applying Egor's 'non-greedy' A=1,B=2,C=3, 33 Egor's original source string A=1,B=2,C=3, .*B=(.)*{0,1} Schemaczar Another variant of Egor's "non-greedy" A=1,B=2,C=3,
I am pretty sure that CASE 30 is what you thought you were writing - that is, you thought the "*?" had a stronger association than the "*" by itself. True for Perl, I guess, but for Oracle (and presumably canonical POSIX) RE's, the "*?" has a lower precedence and associativity than "*". So Oracle reads it as "(.*)?" (case 31) whereas Perl reads it as "(.)*?", that is, case 30.
Note cases 32 and 33 indicate that "*{0,1}" does not work like "*?".
Note that Oracle REGEXP does not work like LIKE, that is, it does not require the match pattern to cover the entire test string. Using the "^" begin and "$" end markers might help you with this as well.
My script:
SET SERVEROUTPUT ON<<DISCREET_DROP>> begin DBMS_OUTPUT.ENABLE; for dropit in (select 'DROP TABLE ' || TABLE_NAME || ' CASCADE CONSTRAINTS' AS SYNT FROM TABS WHERE TABLE_NAME IN ('TEST_PATS', 'TEST_STRINGS') ) LOOP DBMS_OUTPUT.PUT_LINE('Dropping via ' || dropit.synt); execute immediate dropit.synt; END LOOP;END DISCREET_DROP;/---------------------------------------------------------- DDL for Table TEST_PATS-------------------------------------------------------- CREATE TABLE TEST_PATS ( RE VARCHAR2(2000), FROM_WHOM VARCHAR2(50), PAT_GROUP VARCHAR2(50), PAT_ORDER NUMBER(9,0) ) ;/---------------------------------------------------------- DDL for Table TEST_STRINGS-------------------------------------------------------- CREATE TABLE TEST_STRINGS ( TEXT VARCHAR2(2000), SRC VARCHAR2(200), TEXT_GROUP VARCHAR2(50), TEXT_ORDER NUMBER(9,0) ) ;/---------------------------------------------------------- DDL for View REGEXP_TESTER_V-------------------------------------------------------- CREATE OR REPLACE FORCE VIEW REGEXP_TESTER_V (CASE_NUMBER, SRC, TEXT, RE, FROM_WHOM, RESULT) AS select pat_order as case_number, src, text, re, from_whom, regexp_substr (text, re) as resultfrom test_pats full outer join test_strings on (text_group = pat_group)order by pat_order, text_order;/REM INSERTING into TEST_PATSSET DEFINE OFF;Insert into TEST_PATS (RE,FROM_WHOM,PAT_GROUP,PAT_ORDER) values ('.*B=.*?,','Egor''s original pattern "doesn''t work"','Egor',1);Insert into TEST_PATS (RE,FROM_WHOM,PAT_GROUP,PAT_ORDER) values ('.*B=.?,','Egor''s "works correctly"','Egor',2);Insert into TEST_PATS (RE,FROM_WHOM,PAT_GROUP,PAT_ORDER) values ('.*B=(.)*?,','Schemaczar Variant to force Perl operation','Egor',30);Insert into TEST_PATS (RE,FROM_WHOM,PAT_GROUP,PAT_ORDER) values ('.*B=(.*)?,','Schemaczar Variant of Egor to force POSIX','Egor',31);Insert into TEST_PATS (RE,FROM_WHOM,PAT_GROUP,PAT_ORDER) values ('.*B=.*{0,1}','Schemaczar Applying Egor''s ''non-greedy''','Egor',32);Insert into TEST_PATS (RE,FROM_WHOM,PAT_GROUP,PAT_ORDER) values ('.*B=(.)*{0,1}','Schemaczar Another variant of Egor''s "non-greedy"','Egor',33);Insert into TEST_PATS (RE,FROM_WHOM,PAT_GROUP,PAT_ORDER) values ('[^B]*B=[^Bx]*?,','Old Pro answer form 1 "good"','Egor',6);Insert into TEST_PATS (RE,FROM_WHOM,PAT_GROUP,PAT_ORDER) values ('[^B]*B=[^B]*?,','Old Pro answer form 2 "bad"','Egor',7);Insert into TEST_PATS (RE,FROM_WHOM,PAT_GROUP,PAT_ORDER) values ('.*B=.+?,','Old Pro comment 1 form 2','Egor',3);Insert into TEST_PATS (RE,FROM_WHOM,PAT_GROUP,PAT_ORDER) values ('.*B=.{0,}?,','Old Pro comment 2','Egor',5);Insert into TEST_PATS (RE,FROM_WHOM,PAT_GROUP,PAT_ORDER) values ('.+B=.*?,','Old Pro comment 1 form 1','Egor',4);Insert into TEST_PATS (RE,FROM_WHOM,PAT_GROUP,PAT_ORDER) values ('(.)*B=(.)*?,','TBone answer form 1','Egor',8);Insert into TEST_PATS (RE,FROM_WHOM,PAT_GROUP,PAT_ORDER) values ('(\w)*?@(\w)*','TBone answer example 2 form 1','TBone',9);Insert into TEST_PATS (RE,FROM_WHOM,PAT_GROUP,PAT_ORDER) values ('(\w)*@(\w)*?','TBone answer example 2 form 2','TBone',10);REM INSERTING into TEST_STRINGSSET DEFINE OFF;Insert into TEST_STRINGS (TEXT,SRC,TEXT_GROUP,TEXT_ORDER) values ('A=1,B=2,C=3,','Egor''s original source string','Egor',1);Insert into TEST_STRINGS (TEXT,SRC,TEXT_GROUP,TEXT_ORDER) values ('1_@_2_a_3_@_4_a','TBone answer example 2','TBone',2);COLUMN SRC FORMAT A50 WORD_WRAPCOLUMN TEXT FORMAT A50 WORD_WRAPCOLUMN RE FORMAT A50 WORD_WRAPCOLUMN FROM_WHOM FORMAT A50 WORD_WRAPCOLUMN RESULT FORMAT A50 WORD_WRAPSELECT * FROM REGEXP_TESTER_V;