Ketarin forum

Ketarin 1.0.7.272 goes to 100% of one CPU and hangs


appyface


1.0.7.272 is the current version of Ketarin, yes?

 

Try this exactly as shown:

 

1. Launch Ketarin directly by double-clicking the EXE in Windows Explorer (using the jobs.db in the same directory)

 

2. Double-click an app to edit it (choose any that has at least one variable using a regex)

 

3. Click the 'variables' button

 

4. Click on the variable name to load it

 

5. Paste this regex over top of your regex: (?<=download-box.*?href=").*?(?=")

 

6. Click in this regex to position the cursor between href=" and the ) that follows it

 

7. Type this to add to the regex at that position: .*?

 

Inserting the .* characters works, then as soon as you type the ? character, Ketarin hangs. On my system it ties up 100% of one CPU core. No way out except to kill Ketarin.

 

 

The above sequence recreates the hang no matter which jobs.db and which app I use. Actually, the app doesn't have to be using a regex or variable already: the sequence recreates the hang even if I first add a variable and set it to use regex, then paste the regex in.

 

@Flo, if you cannot recreate this, please let me know. I'll send you my jobs.db and/or get debug information for you, to find out what is different on my system.

 

Thanks and regards,

--appyface


Lazy dots (.*?), like lazy quantifiers in general, are known to be problematic because of backtracking. I'm not sure whether Flo can do anything about this (but if he can, all the better =)). Jan Goyvaerts, for instance, shows that the regex '^(.*?,){11}P' needs 29,685 steps to fail to match on

 

1,2,3,4,5,6,7,8,9,10,11,12

 

If you add 3 additional characters to the content, it takes 60,313 steps to fail:

 

1,2,3,4,5,6,7,8,9,10,11,12,13

 

Now if you enclose the lazy dot (.*?) in an atomic group, as in '^(?>(.*?,){11})P', the regex needs only 63 steps to fail. Improve the regex further to '^(?>([^,\r\n]*,){11})P' and it needs only 37 steps to fail, whether you use

 

1,2,3,4,5,6,7,8,9,10,11,12,13
or
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19

 

See http://www.regular-expressions.info/catastrophic.html
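As a rough illustration of the rewrite above, here is a minimal Python sketch, assuming the standard `re` module rather than the .NET engine Ketarin uses. It does not count steps; it only checks that the lazy-dot pattern and the negated-class pattern accept and reject the same inputs. The atomic group is left out because Python's `re` only accepts `(?>...)` from version 3.11 on, and the helper name `agree` and the sample string `matching` are made up for this example.

```python
import re

# Lazy dot: each ".*?" may grow across commas while backtracking.
LAZY = re.compile(r"^(.*?,){11}P")
# Negated class: a field can never swallow a comma, so there is far
# less to retry when the overall match fails.
NEGATED = re.compile(r"^([^,\r\n]*,){11}P")

failing = "1,2,3,4,5,6,7,8,9,10,11,12,13"   # 12th field is not "P"
matching = "1,2,3,4,5,6,7,8,9,10,11,P"      # hypothetical positive case


def agree(text):
    """Return the matched text for each pattern (None when there is no match)."""
    results = []
    for pattern in (LAZY, NEGATED):
        m = pattern.match(text)
        results.append(m.group(0) if m else None)
    return results


print(agree(failing))
print(agree(matching))
```

Both patterns give the same answers here; the difference Jan Goyvaerts describes lies in how much work the engine does before giving up, which this sketch does not measure.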

 

BTW, I posted this first ;) http://ketarin.canneverbe.com/forum/viewtopic.php?id=240

Edited by FranciscoR

Why is Ketarin trying to process it already? As far as I'm concerned I am still editing, and I was definitely not done... there will be no catastrophic backtracking when I'm done...

 

Can Ketarin NOT parse a regex until I'm ready to ask it to do so (ctrl-G or ?)? Typing .*? is not an unusual edit, especially when that is not all there is to the change... but Ketarin wouldn't let me get that far.

 

--appyface


With an 'Edit' checkmark ?

 

This way you could keep existing behaviour while disabling it when required (as appyface suggested).

I agree with this, or something similar, Flo, such as a 'Disable Regex Processing' option that is NOT sticky, so that it can't accidentally be left that way when exiting the regex edit window. Can it be done without a lot of coding? I too have been bitten by this phenomenon.


Hi Flo,

 

These are good suggestions, and would leave the existing functionality the way it is as you have asked.

 

Another viewpoint. Would it be possible to alter the current functionality slightly, to require a keypress or a button, to evaluate the regex?

 

I am thinking of a solution similar to the 'load' and 'find' buttons there now, for their respective boxes. This leaves much of Ketarin's existing functionality intact but changes the editing functionality slightly. With the URL and 'find string' boxes, typing a change doesn't automatically load a new URL or find the string; the buttons must be clicked to request it. For regex, I am thinking that when an *existing* regex is being edited, the auto-eval is not in effect, and a button click or a keypress is required to request re-evaluation of the contents of the regex box. I am a stickler for consistency, so this seems intuitive to me, as it is reasonably consistent with the behavior of the other two boxes. Just another viewpoint.
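To make the idea concrete, here is a minimal sketch in Python of an edit box that only evaluates on request. It is not Ketarin's code; the class name `RegexBox` and its methods are hypothetical, and Python's `re` stands in for the .NET regex library.

```python
import re


class RegexBox:
    """Hypothetical edit box: typing never triggers evaluation."""

    def __init__(self, content):
        self.content = content    # page text the regex is applied to
        self.text = ""            # current (possibly half-typed) regex
        self.evaluations = 0      # how often the regex was actually run

    def set_text(self, text):
        # Called on every keystroke; deliberately does no regex work.
        self.text = text

    def evaluate(self):
        # Called only from a button click or keypress (e.g. ctrl-G).
        self.evaluations += 1
        try:
            pattern = re.compile(self.text)
        except re.error as exc:
            return f"invalid regex: {exc}"
        m = pattern.search(self.content)
        return m.group(0) if m else None


box = RegexBox('<a href="setup.exe">')
box.set_text('(?<=href=").*')           # intermediate state, never evaluated
box.set_text('(?<=href=").*?(?=")')     # finished edit
print(box.evaluate())
```

With this arrangement a half-typed regex is never run, so any expensive intermediate state only costs something if the user explicitly asks for it.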

 

Another idea: Is it possible to make the call to the .net regex library with an event interrupt handler? As you know I've not worked with the .net languages -- in other languages I've used, an object-oriented call to a 3rd party function can include an interrupt event handler "hook" so that an abort request can be sent, the 3rd party function will terminate and return to the caller.

 

If the regex appears 'hung', the person could click an 'abort regex eval' button to kill the task. Once that happens, the button's label could change to 'Eval regex', because aborting still leaves the problem of needing to wait until the edit is finished before starting the eval again.
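One way an abortable evaluation could look, sketched in Python rather than .NET: the match runs in a child process that is killed when a time limit expires. Everything here is an assumption for illustration (the function name `search_with_timeout`, the one-second default, the use of a subprocess); it says nothing about how Ketarin or the .NET regex library actually behave. A fixed timeout stands in for the user's 'abort' click.

```python
import json
import subprocess
import sys

# Code run by the child process: read [pattern, text] as JSON from stdin,
# run the search, and print the matched text (or null) as JSON.
_CHILD = r"""
import json, re, sys
pattern, text = json.loads(sys.stdin.read())
m = re.search(pattern, text)
print(json.dumps(None if m is None else m.group(0)))
"""


def search_with_timeout(pattern, text, timeout_s=1.0):
    """Return ("ok", match), ("error", message) or ("aborted", None)."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", _CHILD],
            input=json.dumps([pattern, text]),
            capture_output=True,
            text=True,
            timeout=timeout_s,   # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return ("aborted", None)
    if proc.returncode != 0:
        return ("error", proc.stderr.strip())
    return ("ok", json.loads(proc.stdout))


print(search_with_timeout(r'href="(.*?)"', '<a href="setup.exe">', 10.0))
# A classic catastrophic pattern: the child never finishes and gets killed.
print(search_with_timeout(r"(a+)+$", "a" * 40 + "b", 0.5))
```

The caller stays responsive either way, because the expensive work happens outside its own process and can simply be thrown away.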

 

Regardless, I'm sure you'll come up with something useful, Flo... thanks as always for your consideration.

 

--appyface

