Ketarin forum

shawn

Moderators
  • Posts

    1,181

Everything posted by shawn

  1. Hi, Flo! I see headers for content-length and content-type, but not content-disposition, which is the one that really matters if you're using alternate streams:

     C:\Tools>gethie http://ketarin.canneverbe.com/download
     GET http://ketarin.org/downloads/Ketarin/Ketarin-1.1.2.337.zip?noredirect
     User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727)

     GET http://ketarin.canneverbe.com/download --> 302 Found
     GET http://ketarin.canneverbe.com/downloads/Ketarin/Ketarin-1.1.2.337.zip --> 302 Found
     GET http://ketarin.org/downloads/Ketarin/Ketarin-1.1.2.337.zip?noredirect --> 200 OK
     Connection: close
     Date: Wed, 11 Aug 2010 00:17:58 GMT
     Accept-Ranges: bytes
     ETag: "179e0da-b6b04-48b058139f880"
     Server: Apache/2.2.16 (Linux/SUSE)
     Content-Length: 748292
     Content-Type: application/zip
     Last-Modified: Sat, 10 Jul 2010 09:54:26 GMT
     Client-Date: Wed, 11 Aug 2010 00:17:57 GMT
     Client-Peer: 78.46.64.174:80
     Client-Response-Num: 1
  2. Frankly, I don't see how this would be any different than what's already available with "split" on the version string - other than the added complexity of the string names (major, minor, build and so on).
  3. @Josh, I'd like to make a recommendation about your regex patterns:

     Version.+?(\d\.\d+)\.

     This should really, at a minimum, be:

     Version.+?(\d+\.\d+)\.

     This will ensure that the moment CCleaner 10.0 is released, your template continues to function. I've seen several templates using similar single-digit or digit-limiting patterns in the past break because of an assumed length in the major, minor or build integers - and it's far more reliable to not rely on a specific length. I personally would have used something like this to get the version:

     Version[^<>]+?([\d\.]+)\.

     *But* since the filename is dependent upon a version string that's impossible to truly predict (major.minor will NOT work once 3.0 is released!), I would instead find the direct URL on the page and consume that in another variable, passing that into the download field. For this one, I'd use something like (pulled from */ccleaner/standard/):

     ["'](http\://download\.piriform\.com/[^'"]+\.exe)["'\?&]

     I use the ["'] and ["'\?&] wrappers because I've experienced a number of times where the site switches between quoting styles, or implements something like Google Analytics with download tracking (see the recent changes on code.google.com for a great example of justification). It's much easier to predict the truly acceptable values than it is to debug and then rewrite the template each time a change is made on the target site.
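A minimal sketch of the difference between those patterns, written in Python for illustration (Ketarin itself uses .NET regular expressions, so treat this as an approximation). The page text, the installer file name `ccsetup234.exe` and the helper name `first_group` are hypothetical and not taken from the real site.

```python
import re

# Hypothetical page text; the real page markup is not shown in the post.
PAGE_TODAY = "Version: v2.34.1200"
PAGE_FUTURE = "Version: v10.0.1234"

# The original pattern assumes a single-digit major version.
SINGLE_DIGIT = re.compile(r"Version.+?(\d\.\d+)\.")
# The suggested pattern allows any number of digits in the major version.
ANY_LENGTH = re.compile(r"Version.+?(\d+\.\d+)\.")
# The direct-URL pattern from the post, tolerant of both quoting styles
# and of a query string or extra parameter after the file name.
DIRECT_URL = re.compile(
    r"""["'](http\://download\.piriform\.com/[^'"]+\.exe)["'\?&]"""
)


def first_group(pattern, text):
    """Return the first capture group, or None when nothing matches."""
    match = pattern.search(text)
    return match.group(1) if match else None


# Hypothetical links in both quoting styles, one with a tracking parameter.
LINK_SINGLE = "<a href='http://download.piriform.com/ccsetup234.exe?track=1'>"
LINK_DOUBLE = '<a href="http://download.piriform.com/ccsetup234.exe">'

print(first_group(SINGLE_DIGIT, PAGE_TODAY))   # works while major < 10
print(first_group(SINGLE_DIGIT, PAGE_FUTURE))  # silently captures the wrong digits
print(first_group(ANY_LENGTH, PAGE_FUTURE))
print(first_group(DIRECT_URL, LINK_SINGLE))
print(first_group(DIRECT_URL, LINK_DOUBLE))
```

Note that the single-digit pattern does not fail loudly on a two-digit major version. It still matches, but on the wrong digits, which is harder to notice than a template that simply stops working.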
  4. Yes, it is possible (I'm using a very similar function for one of my sites for local clients), but it's not a small endeavor - and the way this software is designed is rather system-agnostic. You could, of course, use the source to change the way it operates to be exclusive to downloading through your site, assuming you had the space, bandwidth, and reliability to ensure that something like that could happen. FileHippo is smart enough to use the original file sources whenever possible, which reduces their own costs and bandwidth and improves server performance.
  5. That was just a demonstration page to show how Slices could be effective for downloading cool apps. They're in use in the wild, and having slices "enabled" would literally make that portion of the page "pop" for you.
  6. Unfortunately, none of those are solutions, just restating the problem in different ways. First, I don't really care what the {root} is - I don't want to have to rely on any potentially static information (such as a subfolder or drive letter or other issue) - so "{root}\Path" is simply nonsensical - it requires me to fiddle with it based on how the application is run. That's a fail. And, since {startuppath}, as a function and not a variable, isn't populated in the "command" functionality, I can't use it the way I need to. In order to accomplish what I'm after directly, it looks like I'm going to need to edit *every* template on my list to add a variable that just parses the contents of {startuppath} to relay the information to the command. This would result in bloated templates, duplicated code running unnecessarily several hundred times, and incompatibility with the online database. I'd like to avoid all that if possible. I guess I just don't see more than a couple good reasons for {startuppath} to even be used on the templates themselves (direct extraction or execution being the only justification). And I see no reason for it to be lacking implementation in the global functionality.
  7. Backslash in regex

    B@T, I think you're using a flawed assumption. My understanding was that from a variable with a value like this "c:\user\me\stuff\files.txt" you were trying to return only "files.txt". By default, Regex isn't multiline, so an exclusion ([^\\]) wouldn't return anything if your variable had a carriage return or line feed within it. Consider passing it thru a "trim" before you attempt the regex grab. Or, alternatively, use optional carriage return/line feeds as possible terminators in the grab: {file:regex:[^\\]+[\r\n\s\t]*$}
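A minimal sketch of the "trim first, then grab" suggestion, in Python for illustration only. Ketarin uses .NET regular expressions, whose behavior may differ in details, and the function name `file_name` is hypothetical.

```python
import re


def file_name(path_value):
    """Return the text after the last backslash, ignoring stray whitespace."""
    # Trim first, so a trailing carriage return or line feed
    # cannot end up inside the captured name.
    trimmed = path_value.strip()
    match = re.search(r"[^\\]+$", trimmed)
    return match.group(0) if match else None


print(file_name("c:\\user\\me\\stuff\\files.txt\r\n"))
```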
  8. da023n, are you sure that 7zip is located in the "path" variable? If not, this is most likely what is causing your problems.
  9. Note also that there is an online database of templates which are free to use as a guide. Click the drop-down option next to "add new application" in the bottom left corner of Ketarin and select "import from online database". Search for what you want, or just hit "top 50" to see the most popular templates.
  10. In these situations, I usually use Split. First capture the "version" as the number above (2.34.1200), then generate the download filename using:

      csetup{version:split:0:.}{version:split:1:.}.exe

      This method will work fine UNTIL they change the download filenames to use a different pattern - and they will inevitably do that eventually.
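The split approach above can be sketched in Python as follows. This only mimics the idea of picking parts out of the version string; the helper names `split_part` and `download_name` are hypothetical and are not Ketarin functions.

```python
def split_part(value, index, separator="."):
    """Return one part of a separator-delimited string, like a split lookup."""
    return value.split(separator)[index]


def download_name(version):
    """Build the file name from the first two parts of the version."""
    return "csetup" + split_part(version, 0) + split_part(version, 1) + ".exe"


print(download_name("2.34.1200"))
```

As the post warns, this keeps working only for as long as the site keeps the same file naming pattern.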
  11. Is there a way within the "commands" functionality to access the "startuppath"? I've tried using {startuppath}. Doesn't work. I've tried using a global variable which is assigned to {startuppath}. That doesn't work. I've tried using ".\", but that's unreliable based on how Ketarin is started (it's treated as the active directory instead of the application path). Ideally, I'd have the ability to simply use {startuppath} within commands, since what I'm after is a way to replace this type of thing:

      echo {category} /// {appname} {version}>> "%HOMEPATH%\Desktop\Ketarin\Updates.txt"

      With this:

      echo {category} /// {appname} {version}>> "{startuppath}\Updates.txt"

      This would make Ketarin more portable for me. There are times I need to be able to run it across networks (which means there's a different path) or through shell scripts by another user account. In these situations, my options appear to be to either hard-code the path, or duplicate the folder so that the settings can be unique to the specific instance. I would like to avoid duplication as much as possible, of course.
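To illustrate why ".\" is unreliable, here is a small Python sketch of the same distinction: the directory the program lives in versus the current working directory. It is an analogy only and says nothing about how Ketarin resolves paths internally; the function names `app_dir` and `append_update` are hypothetical.

```python
import sys
from pathlib import Path


def app_dir():
    """Directory containing the running script, whatever the working directory is."""
    return Path(sys.argv[0]).resolve().parent


def append_update(base_dir, category, appname, version):
    """Append one log line to Updates.txt inside base_dir and return its path."""
    log_file = Path(base_dir) / "Updates.txt"
    with log_file.open("a", encoding="utf-8") as handle:
        handle.write(f"{category} /// {appname} {version}\n")
    return log_file
```

Calling `append_update(app_dir(), ...)` writes next to the program, while `append_update(Path.cwd(), ...)` writes wherever the program happened to be started from, which is the behavior the post is trying to avoid.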
  12. necrox - According to the documentation, root is the root of the drive Ketarin is running on - and that's how it behaves in my scripts, too.
  13. You're missing the point - Slices are not about "IE", they're about trimming the fat in large pages down to only the important information. If you can easily identify a consistent block of meaningful code based on an element's ID, through an existing standard, it should be a very simple implementation in the core. While "slices" is the name MS uses in IE for this functionality, URLs with named anchors have existed as long as HTML itself - the only significant difference between this and named anchors in general is that under "slices" the object with the associated ID is not only an inline reference but a container - which means that the agent/browser/whatever would consume the entire codeblock that relates to the ID and nothing more. And, since Ketarin is already parsing the content, I'm pretty confident that it would be a relatively small change to add the ability to parse the results for a named ID or other selector...but the ability to convey that with an accepted definition ("slices" or "named anchors") would make describing the feature far more effective. I've come across the need for something like this a few times, but have coded around it the long way. It would be far more useful to only have to parse what's absolutely necessary in the returned HTML. On a related note, it sure would be nifty to have the ability to use XSL selectors in general, too - but I'd be perfectly satisfied (for now) with slices.
  14. I don't want to fan the flames of a browser war - there are plenty of reasons to avoid each vendor's browser - but the fact is, IE has the majority of the Internet market, so any website not designed to work correctly within IE is a waste of webspace. The problem is most likely related to how the redirects are occurring. There doesn't appear to be a header indicating the filename, and this is what IE uses (and what RFC 2183 defines, by the way) for indicating file names for download. Browsers are free to interpret the correct saveas name from any resource through the URLs provided - IE uses the original filename, Firefox uses the redirect path. The only "right" way is to include the content-disposition header to forcefully tell the browser what the "correct" filename is. This should be a header like so:

      Content-Disposition: attachment; filename=Ketarin-1.1.2.337.zip; size=748292;

      This tells the browser the file size (for progress bar support), the file name (to indicate what the filename should be in a saveas dialog), and "attachment", which effectively means "show a saveas dialog". The header can also carry other optional information, such as dates. It's also important to know that the Content-Type header is used to indicate the *type* of file, but is not firmly tied to a file extension under Windows (or any other OS). More about Content-Disposition here: http://www.apps.ietf.org/rfc/rfc2183.html
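A rough sketch, in Python, of how a client might pick a save-as name: prefer the filename from Content-Disposition, and fall back to the last path segment of whichever URL the client chooses to look at. This is an illustration under stated assumptions, not a description of what IE or Firefox actually do internally; the function name `save_as_name` is hypothetical.

```python
from email.message import Message
from posixpath import basename
from urllib.parse import urlsplit


def save_as_name(headers, url):
    """Pick a file name from response headers, falling back to the URL path."""
    message = Message()
    for key, value in headers.items():
        message[key] = value
    # get_filename() reads the filename parameter of Content-Disposition.
    name = message.get_filename()
    if name:
        return name
    # No usable header: fall back to the last segment of the URL path.
    return basename(urlsplit(url).path)


with_header = {
    "Content-Disposition": "attachment; filename=Ketarin-1.1.2.337.zip; size=748292",
}
without_header = {"Content-Type": "application/zip"}

print(save_as_name(with_header, "http://ketarin.canneverbe.com/download"))
print(save_as_name(without_header, "http://ketarin.canneverbe.com/download"))
print(save_as_name(
    without_header,
    "http://ketarin.org/downloads/Ketarin/Ketarin-1.1.2.337.zip?noredirect",
))
```

Without the header, the result depends entirely on which URL in the redirect chain is used, which matches the difference between browsers described above.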
  15. My vote is to make whatever changes are necessary to minimize or eliminate the spam. I know Ketarin isn't a revenue-generator for you *yet*, but the more users it has, the more likely it is to become so. Getting your support sites tied to "bad neighborhoods" in the search results is the best way to disappear from relevant search results, making it harder for those who *want* something like Ketarin to find it.
  16. reference: http://ketarin.canneverbe.com/forum/viewtopic.php?pid=3654#p3654

      Okay, this gives me totally mental ideas here. But I need to know a little more about how Ketarin behaves. Does Ketarin ALWAYS hit every URL in each variable of the "content from URL" type? That is, even if the VERSION is the same, does Ketarin keep trying the other variables and URLs? Is there a point within the process of enumerating variables where Ketarin stops processing variables, for example, if they don't programmatically appear to serve any purpose within the version check, commands, or file naming? That is, suppose I create a variable named "ignore" and assign it to 'content from URL' with the following URL:

      http://example.com/parsed/{appname}/{version}

      The page is set to simply return a couple of words or something, and I discard the result (not used in ANY way at all). Will the URL still be hit? Will it only be hit if the version is different from the previous one? I'm sorry that I'm probably making this more complicated than it is, but this could REALLY reduce the amount of manual labor required to work on one of my sites, and that would be WONDERFUL.
  17. {root} won't work, since that's the root of the drive. I'm trying to programmatically export my own tailored log to a text file I use for monitoring changes. I would like to be able to make it as portable as possible, which means it needs to write to a child folder of the Ketarin running instance. It won't always be the same path (sometimes it'll be run through a network, mapped drive, USB key, or other methods). I wanted to use ".\" but I'm afraid of the potential for the "." to be misinterpreted as a different path on various operating systems.
  18. I have several sites where similar behaviors are required, so I coded a way around them using my own site as the version checker. I coded up a page in PHP that determines the 'rules' for the incrementation, then increments the version on those events. For your situation, I would use a date_add to get the same functionality. The version doesn't really have to mean anything, just be consistent with previous versioning, so it would effectively allow you to get the content correctly only when the specific dates have changed.
  19. This would be so helpful to me - and I would gladly rewrite (and share, of course) all of my templates from the download sites.
  20. I guess I should have stated that you have to visit the URL above using Internet Explorer 8 or newer with the "Slices" functionality enabled. Slices are an answer to the problem of sites that do not provide a legitimate or useful RSS feed, but they do provide other content in a consistent fashion. In the site above, the "sliced" section is the changelog (there didn't used to be an RSS feed, btw). Using a URL with an anchor reference, like so:

      http://www.bullzip.com/products/pdf/info.php#slice_versionhistory

      ...would effectively strip off all text before the object with that id. This would result in a much smaller code sample to have to parse for your RegEx pattern or before/after links. In the link above, it would effectively remove everything outside of this object:

      <div class="hslice" id="slice_versionhistory"> ... </div>

      Thus, a page that's 70kb of text with numerous repeated phrases becomes 30kb, with much less duplication, which reduces the risk of "selecting" the wrong content.
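As a sketch of what "consume only the element with that id" could look like, here is a small Python example built on the standard library HTML parser. It is not how Ketarin or IE implement slices; the class name `SliceExtractor`, the function `extract_slice` and the sample page are all hypothetical, and the handling of void elements is deliberately simplistic.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag and must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class SliceExtractor(HTMLParser):
    """Collect the text inside the first element whose id matches."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # nesting depth inside the target element
        self.done = False   # set once the target element has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_slice(html, target_id):
    """Return the text content of the element with the given id."""
    parser = SliceExtractor(target_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


# Hypothetical page with repeated "Version" phrases outside the slice.
SAMPLE = (
    "<html><body><p>Version 1.0 noise</p>"
    '<div class="hslice" id="slice_versionhistory">'
    "<h2>History</h2><p>Version 2.0<br>fixed</p></div>"
    "<p>Version 0.9 footer</p></body></html>"
)

print(extract_slice(SAMPLE, "slice_versionhistory"))
```

Running a version regex against the extracted text instead of the whole page removes the repeated phrases outside the slice from consideration.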
  21. AWESOME! I can't wait 'til tomorrow when I get to find and then rewrite a half dozen searches. This is going to save me so much time and variables.
  22. This was from a previous version - it's been working fine for a month or more.