appyface Posted July 16, 2011
Has anyone been successful scraping and downloading from the m$ Hardware site since its last redesign? Here's one example: http://www.microsoft.com/hardware/en-us/d/wireless-laser-mouse-5000 From this page, I want one Ketarin app to scrape and download the Win7 32-bit version, and a second app to do the same for the Win7 64-bit version. If anyone has figured this out, I would appreciate a template or some helpful pointers :-) A few times I thought I had it worked out, but ... no. Kind regards, --appyface Please note, I'm specifically asking about downloading from this page/site. I'm not looking for alternative download URLs.
shawn Posted July 17, 2011
Here are working download templates for the x86 and x64 versions of both IntelliType and IntelliPoint:
<?xml version='1.0' encoding='utf-8'?> <Jobs> <ApplicationJob xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Guid="e980935e-2e08-4afa-a3dc-50782ee06612"> <Category>Drivers</Category> <WebsiteUrl>http://www.microsoft.com/hardware/mouseandkeyboard/intellipoint/readme</WebsiteUrl> <UserAgent /> <UserNotes>IntelliPoint is the driver system for Microsoft mouse products.</UserNotes> <LastFileSize>27872656</LastFileSize> <LastFileDate>2011-04-13T19:13:30-07:00</LastFileDate> <IgnoreFileInformation>false</IgnoreFileInformation> <DownloadBeta>Default</DownloadBeta> <DownloadDate xsi:nil="true" /> <CheckForUpdatesOnly>false</CheckForUpdatesOnly> <VariableChangeIndicator /> <CanBeShared>true</CanBeShared> <ShareApplication>false</ShareApplication> <ExclusiveDownload>false</ExclusiveDownload> <HttpReferer /> <SetupInstructions /> <Variables> <item> <key> <string>version</string> </key> <value> <UrlVariable> <RegexRightToLeft>false</RegexRightToLeft> <VariableType>RegularExpression</VariableType> <Regex>.+/IPx86_1033_([\d\.]+)\.exe.+?</Regex> <Url>http://www.microsoft.com/hardware/en-us/d/comfort-optical-mouse-3000</Url> <Name>version</Name> </UrlVariable> </value> </item> <item> <key> <string>dl</string> </key> <value> <UrlVariable> <RegexRightToLeft>false</RegexRightToLeft> <VariableType>RegularExpression</VariableType> <Regex>"(http://download.microsoft.com/download/[^"]+/IPx86_1033_{version}.exe)"</Regex> <Url>http://www.microsoft.com/hardware/en-us/d/comfort-optical-mouse-3000</Url> <Name>dl</Name> </UrlVariable> </value> </item> <item> <key> <string>swebsite</string> </key> <value> <UrlVariable> <RegexRightToLeft>false</RegexRightToLeft> <VariableType>Textual</VariableType> <Regex /> <TextualContent>http://www.microsoft.com/hardware/en-us/downloads</TextualContent>
<Name>swebsite</Name> </UrlVariable> </value> </item> <item> <key> <string>snotes</string> </key> <value> <UrlVariable> <RegexRightToLeft>false</RegexRightToLeft> <VariableType>Textual</VariableType> <Regex /> <TextualContent /> <Name>snotes</Name> </UrlVariable> </value> </item> <item> <key> <string>schangelog</string> </key> <value> <UrlVariable> <RegexRightToLeft>false</RegexRightToLeft> <VariableType>Textual</VariableType> <Regex /> <TextualContent>http://www.microsoft.com/hardware/intellipoint/en-us/default.mspx</TextualContent> <Name>schangelog</Name> </UrlVariable> </value> </item> </Variables> <ExecuteCommand /> <ExecutePreCommand /> <ExecuteCommandType>Batch</ExecuteCommandType> <ExecutePreCommandType>Batch</ExecutePreCommandType> <SourceType>FixedUrl</SourceType> <PreviousLocation>C:\Users\User\Desktop\Ketarin\.\Drivers\MS_IntelliPoint_x86-8.15.406.0.exe</PreviousLocation> <DeletePreviousFile>true</DeletePreviousFile> <Enabled>true</Enabled> <FileHippoId /> <LastUpdated>2011-07-16T18:43:50.7386767-07:00</LastUpdated> <TargetPath>.\{category}\{appname:regexreplace:([\s\t\r\n\-\\&]+):_}-{version}.{url:ext}</TargetPath> <FixedDownloadUrl>{dl}</FixedDownloadUrl> <Name>MS IntelliPoint x86</Name> </ApplicationJob> <ApplicationJob xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Guid="25364dea-d197-410d-9a34-7bb7710a37c9"> <Category>Drivers</Category> <WebsiteUrl>http://www.microsoft.com/hardware/mouseandkeyboard/intellipoint/readme</WebsiteUrl> <UserAgent /> <UserNotes>IntelliPoint is the driver system for Microsoft mouse products.</UserNotes> <LastFileSize>30307728</LastFileSize> <LastFileDate>2011-04-13T19:13:31-07:00</LastFileDate> <IgnoreFileInformation>false</IgnoreFileInformation> <DownloadBeta>Default</DownloadBeta> <DownloadDate xsi:nil="true" /> <CheckForUpdatesOnly>false</CheckForUpdatesOnly> <VariableChangeIndicator /> <CanBeShared>true</CanBeShared> <ShareApplication>false</ShareApplication> 
<ExclusiveDownload>false</ExclusiveDownload> <HttpReferer /> <SetupInstructions /> <Variables> <item> <key> <string>version</string> </key> <value> <UrlVariable> <RegexRightToLeft>false</RegexRightToLeft> <VariableType>RegularExpression</VariableType> <Regex>.+/IPx64_1033_([\d\.]+)\.exe.+?</Regex> <Url>http://www.microsoft.com/hardware/en-us/d/comfort-optical-mouse-3000</Url> <Name>version</Name> </UrlVariable> </value> </item> <item> <key> <string>dl</string> </key> <value> <UrlVariable> <RegexRightToLeft>false</RegexRightToLeft> <VariableType>RegularExpression</VariableType> <Regex>"(http://download.microsoft.com/download/[^"]+/IPx64_1033_{version}.exe)"</Regex> <Url>http://www.microsoft.com/hardware/en-us/d/comfort-optical-mouse-3000</Url> <Name>dl</Name> </UrlVariable> </value> </item> <item> <key> <string>swebsite</string> </key> <value> <UrlVariable> <RegexRightToLeft>false</RegexRightToLeft> <VariableType>Textual</VariableType> <Regex /> <TextualContent>http://www.microsoft.com/hardware/en-us/downloads</TextualContent> <Name>swebsite</Name> </UrlVariable> </value> </item> <item> <key> <string>snotes</string> </key> <value> <UrlVariable> <RegexRightToLeft>false</RegexRightToLeft> <VariableType>Textual</VariableType> <Regex /> <TextualContent /> <Name>snotes</Name> </UrlVariable> </value> </item> <item> <key> <string>schangelog</string> </key> <value> <UrlVariable> <RegexRightToLeft>false</RegexRightToLeft> <VariableType>Textual</VariableType> <Regex /> <TextualContent>http://www.microsoft.com/hardware/intellipoint/en-us/default.mspx</TextualContent> <Name>schangelog</Name> </UrlVariable> </value> </item> </Variables> <ExecuteCommand /> <ExecutePreCommand /> <ExecuteCommandType>Batch</ExecuteCommandType> <ExecutePreCommandType>Batch</ExecutePreCommandType> <SourceType>FixedUrl</SourceType> <PreviousLocation>C:\Users\User\Desktop\Ketarin\.\Drivers\MS_IntelliPoint_x64-8.15.406.0.exe</PreviousLocation> <DeletePreviousFile>true</DeletePreviousFile> 
<Enabled>true</Enabled> <FileHippoId /> <LastUpdated>2011-07-16T18:36:33.0126402-07:00</LastUpdated> <TargetPath>.\{category}\{appname:regexreplace:([\s\t\r\n\-\\&]+):_}-{version}.{url:ext}</TargetPath> <FixedDownloadUrl>{dl}</FixedDownloadUrl> <Name>MS IntelliPoint x64</Name> </ApplicationJob> <ApplicationJob xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Guid="9642afa4-04c9-4f13-8f52-3dbbdcc2f405"> <Category>Drivers</Category> <WebsiteUrl>http://www.microsoft.com/hardware/mouseandkeyboard/intellitype/readme</WebsiteUrl> <UserAgent /> <UserNotes>IntelliType is the driver system for Microsoft keyboard products.</UserNotes> <LastFileSize>13843856</LastFileSize> <LastFileDate>2011-04-13T19:14:15-07:00</LastFileDate> <IgnoreFileInformation>false</IgnoreFileInformation> <DownloadBeta>Default</DownloadBeta> <DownloadDate xsi:nil="true" /> <CheckForUpdatesOnly>false</CheckForUpdatesOnly> <VariableChangeIndicator /> <CanBeShared>true</CanBeShared> <ShareApplication>false</ShareApplication> <ExclusiveDownload>false</ExclusiveDownload> <HttpReferer /> <SetupInstructions /> <Variables> <item> <key> <string>version</string> </key> <value> <UrlVariable> <RegexRightToLeft>false</RegexRightToLeft> <VariableType>RegularExpression</VariableType> <Regex>.+/ITPx86_1033_([\d\.]+)\.exe.+?</Regex> <Url>http://www.microsoft.com/hardware/en-us/d/natural-ergonomic-keyboard-4000</Url> <Name>version</Name> </UrlVariable> </value> </item> <item> <key> <string>dl</string> </key> <value> <UrlVariable> <RegexRightToLeft>false</RegexRightToLeft> <VariableType>RegularExpression</VariableType> <Regex>"(http://download.microsoft.com/download/[^"]+/ITPx86_1033_{version}.exe)"</Regex> <Url>http://www.microsoft.com/hardware/en-us/d/natural-ergonomic-keyboard-4000</Url> <Name>dl</Name> </UrlVariable> </value> </item> <item> <key> <string>swebsite</string> </key> <value> <UrlVariable> <RegexRightToLeft>false</RegexRightToLeft> 
<VariableType>Textual</VariableType> <Regex /> <TextualContent>http://www.microsoft.com/hardware/en-us/downloads</TextualContent> <Name>swebsite</Name> </UrlVariable> </value> </item> <item> <key> <string>snotes</string> </key> <value> <UrlVariable> <RegexRightToLeft>false</RegexRightToLeft> <VariableType>Textual</VariableType> <Regex /> <Name>snotes</Name> </UrlVariable> </value> </item> <item> <key> <string>schangelog</string> </key> <value> <UrlVariable> <RegexRightToLeft>false</RegexRightToLeft> <VariableType>Textual</VariableType> <Regex /> <TextualContent>http://www.microsoft.com/hardware/intellitype/en-us/default.mspx</TextualContent> <Name>schangelog</Name> </UrlVariable> </value> </item> </Variables> <ExecuteCommand /> <ExecutePreCommand /> <ExecuteCommandType>Batch</ExecuteCommandType> <ExecutePreCommandType>Batch</ExecutePreCommandType> <SourceType>FixedUrl</SourceType> <PreviousLocation>C:\Users\User\Desktop\Ketarin\.\Drivers\MS_IntelliType_x86-8.15.406.0.exe</PreviousLocation> <DeletePreviousFile>true</DeletePreviousFile> <Enabled>true</Enabled> <FileHippoId /> <LastUpdated>2011-07-16T18:42:52.8833676-07:00</LastUpdated> <TargetPath>.\{category}\{appname:regexreplace:([\s\t\r\n\-\\&]+):_}-{version}.{url:ext}</TargetPath> <FixedDownloadUrl>{dl}</FixedDownloadUrl> <Name>MS IntelliType x86</Name> </ApplicationJob> <ApplicationJob xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Guid="f0bb50da-5d91-4c85-bfd7-82a1eba057f0"> <Category>Drivers</Category> <WebsiteUrl>http://www.microsoft.com/hardware/mouseandkeyboard/intellitype/readme</WebsiteUrl> <UserAgent /> <UserNotes>IntelliType is the driver system for Microsoft keyboard products.</UserNotes> <LastFileSize>16268176</LastFileSize> <LastFileDate>2011-04-13T19:13:16-07:00</LastFileDate> <IgnoreFileInformation>false</IgnoreFileInformation> <DownloadBeta>Default</DownloadBeta> <DownloadDate xsi:nil="true" /> <CheckForUpdatesOnly>false</CheckForUpdatesOnly> 
<VariableChangeIndicator /> <CanBeShared>true</CanBeShared> <ShareApplication>false</ShareApplication> <ExclusiveDownload>false</ExclusiveDownload> <HttpReferer /> <SetupInstructions /> <Variables> <item> <key> <string>version</string> </key> <value> <UrlVariable> <RegexRightToLeft>false</RegexRightToLeft> <VariableType>RegularExpression</VariableType> <Regex>.+/ITPx64_1033_([\d\.]+)\.exe.+?</Regex> <Url>http://www.microsoft.com/hardware/en-us/d/natural-ergonomic-keyboard-4000</Url> <Name>version</Name> </UrlVariable> </value> </item> <item> <key> <string>dl</string> </key> <value> <UrlVariable> <RegexRightToLeft>false</RegexRightToLeft> <VariableType>RegularExpression</VariableType> <Regex>"(http://download.microsoft.com/download/[^"]+/ITPx64_1033_{version}.exe)"</Regex> <Url>http://www.microsoft.com/hardware/en-us/d/natural-ergonomic-keyboard-4000</Url> <Name>dl</Name> </UrlVariable> </value> </item> <item> <key> <string>swebsite</string> </key> <value> <UrlVariable> <RegexRightToLeft>false</RegexRightToLeft> <VariableType>Textual</VariableType> <Regex /> <TextualContent>http://www.microsoft.com/hardware/en-us/downloads</TextualContent> <Name>swebsite</Name> </UrlVariable> </value> </item> <item> <key> <string>snotes</string> </key> <value> <UrlVariable> <RegexRightToLeft>false</RegexRightToLeft> <VariableType>Textual</VariableType> <Regex /> <Name>snotes</Name> </UrlVariable> </value> </item> <item> <key> <string>schangelog</string> </key> <value> <UrlVariable> <RegexRightToLeft>false</RegexRightToLeft> <VariableType>Textual</VariableType> <Regex /> <TextualContent>http://www.microsoft.com/hardware/intellitype/en-us/default.mspx</TextualContent> <Name>schangelog</Name> </UrlVariable> </value> </item> </Variables> <ExecuteCommand /> <ExecutePreCommand /> <ExecuteCommandType>Batch</ExecuteCommandType> <ExecutePreCommandType>Batch</ExecutePreCommandType> <SourceType>FixedUrl</SourceType> 
<PreviousLocation>C:\Users\User\Desktop\Ketarin\.\Drivers\MS_IntelliType_x64-8.15.406.0.exe</PreviousLocation> <DeletePreviousFile>true</DeletePreviousFile> <Enabled>true</Enabled> <FileHippoId /> <LastUpdated>2011-07-16T18:43:53.9378597-07:00</LastUpdated> <TargetPath>.\{category}\{appname:regexreplace:([\s\t\r\n\-\\&]+):_}-{version}.{url:ext}</TargetPath> <FixedDownloadUrl>{dl}</FixedDownloadUrl> <Name>MS IntelliType x64</Name> </ApplicationJob> </Jobs>
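For anyone wondering how the `version` and `dl` variables in these templates fit together, here is a rough Python sketch of the same two-step idea: capture the version from the installer file name first, then substitute it into a second pattern that captures the full quoted download URL. It is only an illustration, not Ketarin's own code. The `SAMPLE_HTML` fragment and the `resolve_download` helper are made up for the example, and the real product page markup may look different.

```python
import re
from typing import Tuple

# Hypothetical page fragment. The path segments "A/B/C" are placeholders,
# not real download paths; only the file names follow the templates above.
SAMPLE_HTML = (
    '<a href="http://download.microsoft.com/download/A/B/C/'
    'IPx86_1033_8.15.406.0.exe">32-bit</a>\n'
    '<a href="http://download.microsoft.com/download/A/B/C/'
    'IPx64_1033_8.15.406.0.exe">64-bit</a>\n'
)


def resolve_download(html: str, prefix: str) -> Tuple[str, str]:
    """Return (version, url) for the installer whose name starts with prefix."""
    # Step 1, like the `version` variable: capture the dotted version number.
    version_match = re.search(
        "/" + re.escape(prefix) + r"_1033_([\d.]+)\.exe", html
    )
    if version_match is None:
        raise ValueError("no installer found for prefix " + prefix)
    version = version_match.group(1)

    # Step 2, like the `dl` variable: plug the version in and capture the
    # quoted URL. Dots are escaped here so they only match literal dots.
    dl_pattern = (
        r'"(http://download\.microsoft\.com/download/[^"]+/'
        + re.escape(prefix) + "_1033_" + re.escape(version) + r'\.exe)"'
    )
    dl_match = re.search(dl_pattern, html)
    if dl_match is None:
        raise ValueError("download URL not found for version " + version)
    return version, dl_match.group(1)


if __name__ == "__main__":
    for prefix in ("IPx86", "IPx64"):
        print(prefix, resolve_download(SAMPLE_HTML, prefix))
```

Because `[^"]+` cannot cross a quote, the second pattern stays inside a single attribute value, which is what keeps the x86 and x64 links from being mixed up when both sit on the same page.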
shawn Posted July 17, 2011
To target a specific model, change the URL used in the version & dl variables (though as far as I can tell, the downloads are the same on every product page).
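If you would rather not edit the URL by hand in each job, something like the following Python sketch could rewrite the `Url` of the `version` and `dl` variables in an exported template. Treat it as an assumption-laden example: `TEMPLATE` is a trimmed-down stand-in for one exported job rather than a full export, and `retarget` is a hypothetical helper name, not anything Ketarin provides.

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for one exported job; only the elements used here.
TEMPLATE = """<ApplicationJob>
  <Variables>
    <item><key><string>version</string></key><value><UrlVariable>
      <Url>http://www.microsoft.com/hardware/en-us/d/comfort-optical-mouse-3000</Url>
      <Name>version</Name>
    </UrlVariable></value></item>
    <item><key><string>dl</string></key><value><UrlVariable>
      <Url>http://www.microsoft.com/hardware/en-us/d/comfort-optical-mouse-3000</Url>
      <Name>dl</Name>
    </UrlVariable></value></item>
    <item><key><string>swebsite</string></key><value><UrlVariable>
      <TextualContent>http://www.microsoft.com/hardware/en-us/downloads</TextualContent>
      <Name>swebsite</Name>
    </UrlVariable></value></item>
  </Variables>
</ApplicationJob>"""


def retarget(job_xml: str, product_url: str, names=("version", "dl")) -> str:
    """Point the named URL variables of a job at a different product page."""
    root = ET.fromstring(job_xml)
    for variable in root.iter("UrlVariable"):
        # Only touch the variables that actually scrape the product page.
        if variable.findtext("Name") in names:
            url = variable.find("Url")
            if url is not None:
                url.text = product_url
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(retarget(
        TEMPLATE,
        "http://www.microsoft.com/hardware/en-us/d/wireless-laser-mouse-5000",
    ))
```

The textual variables such as `swebsite` are left alone, since they have no `Url` element that is scraped.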
appyface Posted July 17, 2011
Thank you Shawn, I'll check these out :-) Kind regards, --appyface