Carriage Return/Linefeed or Linefeed/Carriage Return?


4 replies to this topic

#1 appyface

appyface
  • Members
  • 463 posts

Posted 27 March 2017 - 01:21 AM

Using Ketarin 1.8.7.

 

I hope someone can help to explain what I'm seeing.

 

First, please look at this example:

 

1. Create a Ketarin variable "vers" and load this URL: http://www.piriform....leaner/download

2. Regex: release notes.*?<strong>.*?v(.{1,10})\n

 

This (today) populates "vers" with 5.28.6005

Note the linefeed match.

 

 

Now, copy/paste the page source directly from the Ketarin window into a hex editor or other editor that can visualize carriage returns and linefeeds.

 

When I paste this source into EditPad Pro (my usual editor) and display it in hex mode, what follows the 5.28.6005 in the source is x0D x0A, a carriage return/linefeed.

 

Since "\n" matches immediately after the version, I am assuming Ketarin is correct and the source really is LF/CR.  Or is Ketarin misinterpreting the source?

 

I've never had EditPad Pro show terminators incorrectly (it supports all the variations), but I'll try another hex editor anyway.

 

Assuming for the moment that EditPad Pro isn't at fault here either, could Windows be converting the terminators in the clipboard to the standard Windows CR/LF sequence?  I'm using Win 10 Pro.  Thoughts?
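One way to take the clipboard and the editor out of the picture is to count the terminators in the raw bytes directly. The following is only a rough sketch in Python: the function name count_terminators and the sample bytes are made up for illustration and are not the real page source.

```python
import re


def count_terminators(data: bytes) -> dict:
    """Count each kind of line terminator found in raw bytes."""
    counts = {"CRLF": 0, "LFCR": 0, "LF": 0, "CR": 0}
    names = {b"\r\n": "CRLF", b"\n\r": "LFCR", b"\n": "LF", b"\r": "CR"}
    # Alternation order matters: two-byte sequences are tried before lone bytes.
    for match in re.finditer(rb"\r\n|\n\r|\n|\r", data):
        counts[names[match.group()]] += 1
    return counts


# Hypothetical sample standing in for the downloaded page (not the real source).
sample = b"<strong>CCleaner v5.28.6005\r\n</strong>\r\n"
print(count_terminators(sample))
# {'CRLF': 2, 'LFCR': 0, 'LF': 0, 'CR': 0}
```

To check a real download, save the page to disk and read it in binary mode (open(path, "rb")), so that nothing translates the newlines before they are counted.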



#2 shawn

shawn
  • Moderators
  • 800 posts

Posted 27 March 2017 - 02:01 AM

Windows sometimes converts LF+CR to CR+LF when copied to the clipboard.

 

That said, I tested the URL in several different clients and got CRLF in all of them. 

 

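Also, regex "\n" matches only the linefeed character (x0A), not CR, CRLF or LFCR as a unit. Your pattern still matches CRLF input because "." matches any character except linefeed in .NET and several other engines, so the carriage return (x0D) is swallowed as the last character of the .{1,10} capture and "\n" then matches the linefeed that follows. If you want "characters commonly interpreted as white space", including x09 (tab), x0B (vertical tab), x0D and \u0085 (Unicode next line), that is "\s", and exactly which characters it covers depends on the specific regex engine.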
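A quick way to test what "\n" and "." actually do against CRLF input is to run the pattern from the first post on a small sample. This sketch uses Python's re module as a stand-in, not Ketarin's own engine. As far as I know Python and .NET agree on these two points ("\n" is linefeed only, "." is anything except linefeed by default). The helper name extract_version and the sample strings are assumptions made up for the test.

```python
import re

# The pattern from the first post, unchanged.
PATTERN = r"release notes.*?<strong>.*?v(.{1,10})\n"


def extract_version(source: str):
    """Return the raw capture group, or None when the pattern does not match."""
    match = re.search(PATTERN, source)
    return match.group(1) if match else None


# Hypothetical stand-ins for the page source, one per terminator style.
crlf_source = "release notes</a> <strong>CCleaner v5.28.6005\r\n</strong>"
lf_source = "release notes</a> <strong>CCleaner v5.28.6005\n</strong>"

print(repr(extract_version(crlf_source)))  # '5.28.6005\r'
print(repr(extract_version(lf_source)))    # '5.28.6005'

# "\n" on its own does not match a carriage return.
print(re.fullmatch(r"\n", "\r"))           # None
```

With CRLF input the capture ends in an invisible carriage return, which would explain why the value looks right on screen. Trimming the result, or writing the terminator as "\r?\n" with a tighter capture such as "[\d.]+", should keep the CR out of the variable.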



#3 appyface

appyface
  • Members
  • 463 posts

Posted 27 March 2017 - 02:11 AM

Thanks, Shawn, for confirming what I suspected about the Windows clipboard.

 

Which regex engine is Ketarin using?  .NET, or something else?



#4 MAPJe71

MAPJe71
  • Members
  • 28 posts
  • Location: The Netherlands

Posted 27 March 2017 - 10:00 PM

AFAIK it's a .NET application written in C# that doesn't use any third-party libraries for regular expressions, so it should be the built-in .NET regex engine.



#5 appyface

appyface
  • Members
  • 463 posts

Posted 28 March 2017 - 01:33 PM

I think that's right, too.  Thank you!




