Stalker Posted March 7, 2009
If I add NOD32 (located at http://www.filehippo.com/download_nod32/) to Ketarin, it scrapes {version} as "32 AntiVirus 4.0.314". Could you please fix it, Flo?
floele Posted March 7, 2009
I will, if you give me a regular expression that works for all FileHippo versions, including this one.
FranciscoR Posted March 7, 2009
Now that's a classic =). From what I see, I imagine that
- Windows Live Messenger 2009 (14.0.8064)
- Ad-Aware 2009 8.0.0.0
- Foobar2000 0.9.6.3
- 3DMark Vantage 1.0.1
etc. are also on this list. I can solve 97% of versions, but for the remaining 3% I will use "Date added". Is this of any interest?
Edited March 7, 2009 by FranciscoR
floele Posted March 7, 2009
Yep, sure.
FranciscoR Posted March 7, 2009
(?<=\<td\>.*?)(\s\(?\d+?\.\d+?.*?|[a-z]+?\s\d{1,2},\s\d{4})(?=\</td\>)
It's actually more difficult than I first thought.
- Windows Live Messenger 2009 (14.0.8064) = (14.0.8064)
- Ad-Aware 2009 8.0.0.0 = 8.0.0.0
- Foobar2000 0.9.6.3 = 0.9.6.3
- 3DMark Vantage 1.0.1 = 1.0.1
- NOD32 AntiVirus 4.0.314 = 4.0.314
- Windows Media Player 11 = October 30, 2006
If I find a better solution I'll post it here. I'm using the technical tab to get the version.
Edited March 7, 2009 by FranciscoR
Stalker Posted March 7, 2009
Found a little problem with the regexp. When matching NVIDIA Forceware 182.08 WHQL XP, it returns two matches:
1) " 182.08 WHQL XP" (note the space before 182)
2) "arch 4, 2009"
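Both results can be reproduced outside Ketarin. The sketch below is only an approximation in Python: its `re` module does not support the variable-length lookbehind used in the .NET expression above, so the `(?<=\<td\>.*?)` prefix is dropped and the two alternatives are tried separately. The cell strings are hypothetical stand-ins for FileHippo table cells, and this thread does not say whether Ketarin matches case-insensitively, so both cases are shown.

```python
import re

# Hypothetical table cells; the real FileHippo markup may differ.
title_cell = "<td>NVIDIA Forceware 182.08 WHQL XP</td>"
date_cell = "<td>March 4, 2009</td>"

# Version alternative from the expression above, lookbehind omitted.
# The leading \s is part of the match, hence the extra space in the result.
version_re = re.compile(r"\s\(?\d+?\.\d+?.*?(?=</td>)")

# Date alternative, also without the lookbehind.
date_re = re.compile(r"[a-z]+?\s\d{1,2},\s\d{4}(?=</td>)")

version_match = version_re.search(title_cell).group(0)

# Case-sensitive: [a-z] cannot match the capital "M", so the match starts at "a".
date_match = date_re.search(date_cell).group(0)

# Case-insensitive: the whole month name is matched.
date_match_ci = re.search(date_re.pattern, date_cell, re.IGNORECASE).group(0)

print(repr(version_match))
print(repr(date_match))
print(repr(date_match_ci))
```

If the expression is evaluated case-sensitively, that would explain the truncated "arch 4, 2009"; the leading space comes from the `\s` sitting inside the matched alternative rather than in front of it.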
FranciscoR Posted March 8, 2009
1. For DLs such as Nvidia, Ketarin will match 182.08 WHQL XP (first match); for DLs such as Windows Media Player 11, it will match the date, October 30, 2006.
2. Yeah, that additional space is the reason why I said "not so easy". If you have a better suggestion... =)
Edited March 8, 2009 by FranciscoR
FranciscoR Posted March 8, 2009
Flo,
Sometimes I get a perfectly clear match in Expresso but not in Ketarin (no red highlight, for instance, on http://www.filehippo.com/download_windows_media_player/tech/) with an expression like
(?<=\<td\>[a-z].*?\s)(\(?\d+?\.\d+?.*?)(?=\</td\>)|([a-z]+?\s\d{1,2},\s\d{4})
which, btw, solves the above issue. Can you verify whether this is my mistake?
Edited March 8, 2009 by FranciscoR
CybTekSol Posted March 8, 2009
@Stalker,
I know it sounds crazy... but I use a template even for FileHippo apps to customize the {version} scrape and set other personal preferences (path, etc.). Try this regex on the tech tab:
((?<=\>Title:\<.*?\s)(\(?\d+?\.\d+?.*?)(?=\</[a-z]{2}\>)|(?<=\>Date\sadded:\<.*?\<[a-z]{2}\>)([a-z]+?\s\d{1,2}\,\s\d{4})(?=\</[a-z]{2}\>))
I use this in my template and it has proved reliable, but as I have noted before, there is no such thing as perfect when it comes to a 'universal' regex for a dynamic site!
Addendum: After seeing FranciscoR's post regarding Expresso vs. Ketarin: this regex works in Ketarin and fails in Expresso for "Date added:" on WMP11.
Edited March 8, 2009 by CybTekSol
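The idea behind this expression (take the dotted version from the "Title:" row, otherwise fall back to the "Date added:" row) can be sketched in Python. This is an approximation under stated assumptions, not Ketarin's implementation: Python's `re` has no variable-length lookbehind, so capture groups replace the lookarounds; case-insensitive matching is assumed; and `tech_page` builds a hypothetical fragment, since the real tech tab markup is not shown in this thread.

```python
import re

# Title branch: whitespace, then a dotted version, up to a closing two-letter tag.
TITLE_VERSION = re.compile(
    r">Title:<.*?\s(\(?\d+?\.\d+?.*?)</[a-z]{2}>",
    re.IGNORECASE,  # assumption: matching is case-insensitive
)

# Date branch: "Month D, YYYY" in the cell that follows "Date added:".
DATE_ADDED = re.compile(
    r">Date\sadded:<.*?<[a-z]{2}>([a-z]+?\s\d{1,2},\s\d{4})</[a-z]{2}>",
    re.IGNORECASE,
)


def extract_version(html):
    """Return the dotted version from the title, else the 'Date added' value."""
    for pattern in (TITLE_VERSION, DATE_ADDED):
        match = pattern.search(html)
        if match:
            return match.group(1)
    return None


def tech_page(title, date_added):
    """Build a hypothetical tech-tab fragment; the real markup may differ."""
    return (
        f"<tr><td>Title:</td><td>{title}</td></tr>\n"
        f"<tr><td>Date added:</td><td>{date_added}</td></tr>"
    )


# "January 1, 2009" is a placeholder date, not taken from FileHippo.
nod32 = extract_version(tech_page("NOD32 AntiVirus 4.0.314", "January 1, 2009"))
wlm = extract_version(tech_page("Windows Live Messenger 2009 (14.0.8064)", "January 1, 2009"))
wmp = extract_version(tech_page("Windows Media Player 11", "October 30, 2006"))

print(nod32, wlm, wmp, sep=" | ")
```

Because the version must be preceded by whitespace, the "32" in "NOD32" is skipped, and a title without a dotted number (Windows Media Player 11) falls through to the date.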
CybTekSol Posted March 8, 2009
In case anyone is wondering, this is a modified version of my FileHippo template... I do not ask for the 'Application name' in the template because it will 'auto-fill' after the template is imported simply by clicking the text field for 'FileHippo ID:', then clicking the text field for 'Application name:'.
Flo... if you do not want this template posted, please delete this post. I have not posted this in the 'Template Forum' before because of the 'built-in' support for FileHippo, but it does demonstrate that a template can be helpful in many other ways!
<?xml version="1.0" encoding="utf-16"?>
<Jobs>
  <ApplicationJob xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <DownloadBeta>Default</DownloadBeta>
    <DownloadDate xsi:nil="true" />
    <VariableChangeIndicator />
    <CanBeShared>true</CanBeShared>
    <ShareApplication>false</ShareApplication>
    <HttpReferer />
    <Variables>
      <item>
        <key>
          <string>version</string>
        </key>
        <value>
          <UrlVariable>
            <VariableType>RegularExpression</VariableType>
            <Regex>((?<=\>Title:\<.*?\s)(\(?\d+?\.\d+?.*?)(?=\</[a-z]{2}\>)|(?<=\>Date\sadded:\<.*?\<[a-z]{2}\>)([a-z]+?\s\d{1,2}\,\s\d{4})(?=\</[a-z]{2}\>))</Regex>
            <Url><placeholder name="App Page URL from FileHippo?" value="http://www.filehippo.com/download_firefox/" />tech/</Url>
            <Name>version</Name>
          </UrlVariable>
        </value>
      </item>
    </Variables>
    <ExecuteCommand />
    <Category><placeholder name="Category" value="FileHippo" /></Category>
    <SourceType>FileHippo</SourceType>
    <DeletePreviousFile>true</DeletePreviousFile>
    <Enabled>true</Enabled>
    <FileHippoId><placeholder name="App Page URL from FileHippo?" value="http://www.filehippo.com/download_firefox/" /></FileHippoId>
    <LastUpdated xsi:nil="true" />
    <TargetPath><placeholder name="TargetPath" value="{target}\{category}\{appname:replace: :_}_v{version:replace: :_}.{url:ext}" /></TargetPath>
    <FixedDownloadUrl />
    <Name />
  </ApplicationJob>
</Jobs>
@Stalker, try it and see if you like it...
Edited March 8, 2009 by CybTekSol
floele Posted March 8, 2009
There can certainly be differences in regular expression parsing. And if there are, I have no influence over them.
FranciscoR Posted March 8, 2009
This is a bit strange, but if I enclose my latest regex in one big capture group with two non-capturing subgroups, I get a match in both Expresso and Ketarin:
((?<=\<td\>[a-z].*?\s)(?:\(?\d+?\.\d+?.*?)(?=\</td\>)|(?:[a-z]+?\s\d{1,2},\s\d{4}))
The variation by CybTekSol also does the trick:
((?<=\>Title:\<.*?\s)(\(?\d+?\.\d+?.*?)(?=\</[a-z]{2}\>)|(?<=\>Date\sadded:\<.*?\<[a-z]{2}\>)([a-z]+?\s\d{1,2}\,\s\d{4})(?=\</[a-z]{2}\>))
IMO, this is fixed. Flo, will either of these do?
FranciscoR Posted March 8, 2009
Using
((?<=\<td\>[a-z].*?\s)(?:\(?\d+?\.\d+?.*?)(?=\</td\>)|(?:[a-z]+?\s\d{1,2},\s\d{4}))
expect to see the following versions:
Firefox 3.0.7 = 3.0.7
Yahoo! Messenger 9.0.0.2136 = 9.0.0.2136
Firefox 3.0.7 = 3.0.7
Flash Player 10.0.22.87 (IE) = 10.0.22.87 (IE)
Google Chrome 1.0.154.48 = 1.0.154.48
Google Desktop 5.8.809.23506 = 5.8.809.23506
Internet Explorer 8.0 RC1 = 8.0 RC1
Maxthon 2.5.1.4751 = 2.5.1.4751
Opera 9.64 = 9.64
eMule 0.49c = 0.49c
FrostWire 4.17.2 = 4.17.2
LimeWire Basic 5.1.1 = 5.1.1
Shareaza 2.4.0.0 = 2.4.0.0
uTorrent 1.8.3 Beta 14755 = 1.8.3 Beta 14755
Vuze 4.1.0.4 = 4.1.0.4
AIM 6.8.14.6 = 6.8.14.6
Google Talk 1.0.0.104 Beta = 1.0.0.104 Beta
Pidgin 2.5.5 = 2.5.5
Skype 4.0.0.206 = 4.0.0.206
Thunderbird 3.0 Beta 2 = 3.0 Beta 2
Trillian 3.1.12.0 = 3.1.12.0
Windows Live Messenger 2009 (14.0.8064) = (14.0.8064)
Yahoo! Messenger 9.0.0.2136 = 9.0.0.2136
CuteFTP 8.3.2 Home = 8.3.2 Home
FileZilla 3.2.2.1 = 3.2.2.1
FlashGet 1.9.6.1073 = 1.9.6.1073
GMail Drive 1.0.13 = 1.0.13
Adobe Reader 9.0 = 9.0
Foxit Reader 3.0.1301 = 3.0.1301
OpenOffice.org 3.0.1 Final = 3.0.1 Final
Notepad++ 5.2 = 5.2
VMware Player 2.5.1 = 2.5.1
Ad-Aware 2009 8.0.0.0 = 8.0.0.0
CWShredder 2.19 = 2.19
HijackThis 2.0.2 = 2.0.2
Rootkit Revealer 1.71 = 1.71
Spybot Search & Destroy 1.6.2 = 1.6.2
SpywareBlaster 4.1 = 4.1
Windows Defender 1.1.1593 = 1.1.1593
Comodo Firewall 3.0.25.378 = 3.0.25.378
PeerGuardian 2.0 Beta 6c = 2.0 Beta 6c
Sunbelt Personal Firewall 4.6.1861 = 4.6.1861
Sygate Personal Firewall 5.6.2808 = 5.6.2808
ZoneAlarm Free 8.0.065.0 = 8.0.065.0
AntiVir Personal 8.2.00.337 = 8.2.00.337
Avast! Home Edition 4.8.1335 = 4.8.1335
AVG Free Edition 8.5.278 = 8.5.278
CCleaner 2.17.853 = 2.17.853
Recuva 1.24.399 = 1.24.399
Tweak UI 2.1 = 2.1
7-Zip 4.65 = 4.65
WinRAR 3.80 = 3.80
WinZip 12.0.8252 = 12.0.8252
3DMark Vantage 1.0.1 = 1.0.1
CPU-Z 1.50 = 1.50
Sandra Lite XII (15.72) = (15.72)
Hamachi 1.0.3.0 = 1.0.3.0
RealVNC 4.1.3 = 4.1.3
Foobar2000 0.9.6.3 = 0.9.6.3
iTunes 8.0.2.20 = 8.0.2.20
K-Lite Codec Pack 4.70 (Full) = 4.70 (Full)
MediaMonkey 3.1.0.1222 Beta = 3.1.0.1222 Beta
QuickTime Alternative 2.8.0 = 2.8.0
QuickTime Player 7.60.92.0 = 7.60.92.0
Real Alternative 1.90 = 1.90
RealPlayer 11.0.0.581 = 11.0.0.581
Songbird 1.0.0 = 1.0.0
VLC Media Player 0.9.8a = 0.9.8a
Winamp 5.55 Full = 5.55 Full
Windows Media Player 11 = October 30, 2006
DAEMON Tools Lite 4.30.3 = 4.30.3
DeepBurner 1.9.0.228 = 1.9.0.228
DVD Shrink 3.2.0.15 = 3.2.0.15
ImgBurn 2.4.2.0 = 2.4.2.0
Nero Burning Rom 9.2.6.0 = 9.2.6.0
ObjectDock 1.9 = 1.9
RocketDock 1.3.5 = 1.3.5
Samurize 1.64.3 = 1.64.3
WindowBlinds 6.4 = 6.4
Yahoo! Widget Engine 4.5.1 = 4.5.1
FastStone Image Viewer 3.7 = 3.7
IrfanView 4.23 = 4.23
Paint.NET 3.36 = 3.36
Picasa 3.1 Build 70.73 = 3.1 Build 70.73
.NET Framework Version 3.5 SP1 = 3.5 SP1
ATI Catalyst Drivers 9.2 XP = 9.2 XP
DirectX 9.0c (Nov 08) = 9.0c (Nov 08)
IntelliPoint 6.3 = 6.3
IntelliType Pro 6.3 = 6.3
Java Runtime Environment 1.6.0.12 = 1.6.0.12
NVIDIA Forceware 182.08 WHQL XP = 182.08 WHQL XP
floele Posted March 8, 2009
Looks good, I'd say. Thanks!
Stalker Posted March 8, 2009
Thanks guys. Works like a charm.
FranciscoR Posted March 10, 2009
Actually I found a bug. =)
((?<=\>Title:\<.*?\s)(?:\(?\d+?\.\d+?.*?)(?=\</td\>)|(?:[a-z]+?\s\d{1,2},\s\d{4}))
floele Posted March 10, 2009
Let me know if you think this is final.
FranciscoR Posted March 10, 2009
There's no 'final' regex, to quote CybTekSol. But yes, I cannot find any other errors for the moment. =D
CybTekSol Posted March 10, 2009
"Actually I found a bug. =)"
Can you tell me which app the regex failed with? I have not experienced any issues with mine YET. It would be nice to know.
FranciscoR Posted March 11, 2009
Yours is OK, which is why I am now using your prefix. I didn't retest everything after Stalker's comment; 7-Zip, 3DMark and .NET were capturing the date. Before I realized I could make the space before the version work, I started testing with a-z, and later on I forgot to remove it. =)
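The leftover `[a-z]` can be illustrated with a short Python sketch. It rests on several assumptions: the cell is simplified to a hypothetical `<td>title</td>`, a capture group stands in for the .NET lookbehind (which Python's `re` does not support at variable length), and matching is taken to be case-insensitive. With the `[a-z]` in place, any title that starts with a digit or a dot never matches the version branch, which would leave only the date branch to match.

```python
import re

# Shared tail: whitespace, a dotted version, then the closing cell tag.
VERSION_TAIL = r"\s(\(?\d+?\.\d+?.*?)</td>"

# Earlier prefix: the title has to start with a letter.
with_letter = re.compile(r"<td>[a-z].*?" + VERSION_TAIL, re.IGNORECASE)
# Same prefix with the stray [a-z] removed.
without_letter = re.compile(r"<td>.*?" + VERSION_TAIL, re.IGNORECASE)

titles = [
    "Firefox 3.0.7",
    "7-Zip 4.65",
    "3DMark Vantage 1.0.1",
    ".NET Framework Version 3.5 SP1",
]


def version_of(pattern, title):
    """Return the captured version for a hypothetical <td> cell, or None."""
    match = pattern.search(f"<td>{title}</td>")
    return match.group(1) if match else None


results = {
    title: (version_of(with_letter, title), version_of(without_letter, title))
    for title in titles
}

for title, (old, new) in results.items():
    print(f"{title!r}: with [a-z] -> {old!r}, without -> {new!r}")
```

Only "Firefox 3.0.7" yields a version with both prefixes; the other three titles return `None` until the `[a-z]` is dropped.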