+ Reply to Thread
Results 1 to 3 of 3

regex match multiple words in a line

  1. regex match multiple words in a line

    Hi,

    I am trying to construct a regexp in mysql that looks for a multiple
    word match in one line of text within other lines of text. When the
    text is inserted into a field, all lines are already separated by
    '\n'.

    I can only get it work when it looks for one word in a line as such:
    select * from sp where lines regexp('\n.*[[:<:]]word1[[:>:]]+.*[^\n]+
    \n');


    When I look for two words in a line it only returns lines that have
    the search words in that order. For example:

    select * from sp where lines
    regexp('\n.*[[:<:]]word1[[:>:]].*[[:<:]]word2[[:>:]]+.*[^\n]+\n');
    works when words are in the order of 'word1 ... word2' but not when
    words are in the order of 'word2 ... word1'.

    Also I am having trouble searching only the first line of text. I have
    been working with the following but it still doesn't work properly:
    select * from sp where lines
    regexp('^.*[[:<:]]word1[[:>:]].*[[:<:]]word2[[:>:]]+.*[^\n]+\n');

    Thanks,

    Carl


  2. Re: regex match multiple words in a line

    Overall, MySQL regular expression support is pretty weak. You may end
    up needing to something like:

    select * from sp where
    lines regexp('\n.*[[:<:]]word1[[:>:]].*[[:<:]]word2[[:>:]]+.*[^\n]+
    \n') OR
    lines regexp('\n.*[[:<:]]word2[[:>:]].*[[:<:]]word1[[:>:]]+.*[^\n]+
    \n');


    Which you may want to compare against:

    select * from sp where lines LIKE "%word1%word2%" OR lines LIKE
    "%word2%word1%"

    To see which one is faster.

    I ended up recompiling MySQL with plugin support and compiling UDF-
    regexp 1.0 from http://udf-regexp.php-baustelle.de/trac/.. which
    provides great regular expression support for more than just checking
    for simple matches.


  3. Re: regex match multiple words in a line

    On 26 Nov, 12:43, Michael Martinek wrote:
    > Overall, MySQL regular expression support is pretty weak. You may end
    > up needing to something like:
    >
    > select * from sp where
    > lines regexp('\n.*[[:<:]]word1[[:>:]].*[[:<:]]word2[[:>:]]+.*[^\n]+
    > \n') OR
    > lines regexp('\n.*[[:<:]]word2[[:>:]].*[[:<:]]word1[[:>:]]+.*[^\n]+
    > \n');
    >
    > Which you may want to compare against:
    >
    > select * from sp where lines LIKE "%word1%word2%" OR lines LIKE
    > "%word2%word1%"
    >
    > To see which one is faster.
    >
    > I ended up recompiling MySQL with plugin support and compiling UDF-
    > regexp 1.0 fromhttp://udf-regexp.php-baustelle.de/trac/.. which
    > provides great regular expression support for more than just checking
    > for simple matches.


    However the "words in any order" problem is one that can only be
    solved in very advanced REGEX implementations ISTR.

+ Reply to Thread