Discussion in 'Computing' started by Rod Speed, Jun 19, 2004.

  1. Rod Speed

    Rod Speed Guest

    Have a look at http://www.foxtel.com.au/TVGuide.aspx
    Select a channel in the left hand box, say discovery, hit GO.

    I want to import the data from the web page.

    I used to be able to until they changed the format recently.

    I now can't even see the program info with a view source.

    There appears to be a block of encrypted stuff down the
    bottom which is about the right size for the program data.

    Anyone got any ideas about how to decrypt it?
    Rod Speed, Jun 19, 2004
  2. Rod Speed

    Neal Guest

    All I get is a return to the main screen and "Session expired, please try

    Is http://www.foxtel.com.au/ your domain?
    Neal, Jun 19, 2004
  3. Rod Speed

    Rod Speed Guest

    Nope, it's one of the national PayTV operations.

    That is the best of the TV guides for those channels,
    particularly the detail like the episode title etc.
    Rod Speed, Jun 19, 2004
  4. Rod Speed

    Rod Speed Guest

    Nope, you should get the day's TV guide for that channel.

    Looks like it's IE-specific code, since Neal can't see it either.
    Rod Speed, Jun 19, 2004
  5. Rod Speed

    W Guest

    In news:, Rod Speed bitched and moaned:
    Are you referring to the `gibberish` that appears in
    <input type="hidden" name="__VIEWSTATE" .../> ?
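
    For context, an ASP.NET __VIEWSTATE field normally holds Base64-encoded
    serialized page state rather than encrypted content. A minimal sketch
    (Node.js) of pulling the value out and decoding it follows. The helper
    names and the sample markup are made up for illustration and are not
    taken from the Foxtel page.

```javascript
// Hypothetical helper: pull the __VIEWSTATE value out of raw HTML text.
function extractViewState(html) {
  const match = /name="__VIEWSTATE"[^>]*\bvalue="([^"]*)"/.exec(html);
  return match ? match[1] : null;
}

// Base64-decode the value into raw bytes (Node.js Buffer).
// On a real page the bytes are a serialized blob, not readable HTML.
function decodeViewState(value) {
  return Buffer.from(value, "base64");
}

// Made-up sample markup standing in for a real page.
const sampleHtml =
  '<input type="hidden" name="__VIEWSTATE" value="' +
  Buffer.from("demo state").toString("base64") +
  '" />';

const decoded = decodeViewState(extractViewState(sampleHtml));
console.log(decoded.toString("latin1"));
```

    On a real page the decoded bytes will mostly be binary with a few
    readable fragments, which is usually enough to tell whether the field
    carries the data of interest.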
    W, Jun 19, 2004
  6. Rod Speed

    rf Guest

    Yep. There is a bunch of encrypted-looking stuff inside a function call to a
    JavaScript function cv.

    Where does cv live?

    Well, look just above that. There is a language="javascript" link to a
    src="jscript.aspx". This would be an ASP server side routine.

    If you try a view source on this link you will get "Cheating Not Good". Real
    Funny :)

    But wait, if you look there are some parameters to that link. Add the
    parameters and we obtain an eval expression. Here is the first bit:

    This no doubt creates the cv function.

    The parameters are the interesting thing. You can bet that they are randomly
    generated and encrypted for each access to the page. Then, in the call to
    the ASP routine they would be decrypted and used to seed an individual cv
    function, tailored to decrypt the data on that particular instance of the
    page.

    Looks like a quite good "content protection" thing. Well thought out. They
    obviously don't want anybody hotlinking to their content and stealing it, as
    is their right considering they hold the copyright to said material :)
    There are any number of ways. For a start copy/paste this into your address
    bar: (watch the wrap)

    It'll show you the HTML contents of that table cell :)
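
    One way to look at what such an eval expression defines, without letting
    it define anything, is to run the script with eval shadowed by a function
    that only records its argument. The sketch below assumes the script is a
    single eval(...) call. The packed string and the body of cv in it are
    invented stand-ins, not what jscript.aspx returns.

```javascript
// Made-up stand-in for the packed script; the real one differs per request.
const packed =
  'eval("function cv(s) { return s.split(\'\').reverse().join(\'\'); }")';

// Run the packed script with `eval` shadowed by a recorder, so the generated
// source is captured as text instead of being evaluated.
function capturePacked(script) {
  const captured = [];
  const fakeEval = (source) => {
    captured.push(String(source));
  };
  // Functions built with `new Function` are non-strict, so `eval` is a legal
  // parameter name here and hides the built-in.
  new Function("eval", script)(fakeEval);
  return captured;
}

const sources = capturePacked(packed);
console.log(sources[0]);
```

    Anything in the script outside the eval call still runs, so this is only
    suitable for code you are prepared to execute anyway.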
    rf, Jun 19, 2004
  7. Rod Speed

    Rod Speed Guest


    The cycling logos are just one on each program item.
    Rod Speed, Jun 19, 2004
  8. Rod Speed

    Rod Speed Guest

    Looks more likely to be the lower one,
    <td id="tblCellTVGuideResults"
    just because it's bigger and that does have that TVGuideResults text in the header.
    Rod Speed, Jun 19, 2004
  9. Rod Speed

    Rod Speed Guest

    I can see them fine in IE and can cut and paste them from there myself.

    I want to import the web page contents into an Access database.

    I do that with softcom and austar and the FTA channels, and
    used to be able to with foxtel too until the most recent format
    change on that web site, when they added the 'digital' service.
    Rod Speed, Jun 19, 2004
  10. Rod Speed

    Rod Speed Guest

    Yeah, someone with a sense of humor |-)
    Oddly enough a more limited program guide in pdf format in the
    business section. Much briefer items tho and not all channels either.
    Don't get anything useful when I do that.

    I actually want to import the web pages into Access.
    I did that with the previous version of those web pages
    using TransferText but anything that works would suit me fine.
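
    As a rough illustration of the last step, once the program rows have been
    extracted by whatever means, they can be written out as delimited text for
    a TransferText import. The column names and the sample row below are
    hypothetical, and this is only a sketch of the CSV quoting involved.

```javascript
// Quote a field only when it contains a comma, a double quote, or a line break;
// embedded double quotes are doubled.
function csvField(value) {
  const text = String(value);
  return /[",\r\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
}

// Join rows of fields into CSV text with CRLF line endings.
function toCsv(rows) {
  return rows.map((row) => row.map(csvField).join(",")).join("\r\n");
}

// Hypothetical guide rows: header line plus one program entry.
const rows = [
  ["Start", "Title", "Episode"],
  ["19:30", "Example Show", 'The "Pilot", Part 1'],
];

const csvText = toCsv(rows);
console.log(csvText);
```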
    Rod Speed, Jun 19, 2004
  11. Rod Speed

    rf Guest

    Oh, you are right. I gave the alert output the merest of glances. Looks like
    there is more than one level of encryption. They Really must not want you to
    steal their stuff. Try

    Then you are in breach of copyright law. If you continue to steal their
    material I would advise retaining a good solicitor. You'll need one if
    Foxtel find out.
    Foxtel obviously don't think that "anything that works" is fine :)
    rf, Jun 19, 2004
  12. Rod Speed

    Rod Speed Guest

    More likely they don't want other online TV Guide operations to do that.
    Same result, nothing happens at all, the address bar contents stays
    highlighted but nothing changes on the main body of the window.
    Nope. It's no more 'stealing' than reading it off the web page in IE is.
    Nope, perfectly legal in this country.
    Or they just want to prevent other online TV Guide
    operations from getting the data from their web pages.
    Rod Speed, Jun 19, 2004
  13. Rod Speed

    rf Guest

    Sorry, I read "import into access" as meaning you were then going to serve
    the data up onto another web page. If not then OK, they will never know.

    Do I recall discussing this with you a couple of years ago? That is,
    scraping stuff off a web page into an Access database?

    In any case, regardless of what the alert gives you it should be a simple
    matter to knock up a C# or C++ program to host the MSHTML control and point
    it at the page. Then you have programmatic access to the entire DOM.
    rf, Jun 19, 2004
  14. Google: XMLTV.
    Toby A Inkster, Jun 19, 2004
  15. Rod Speed

    Falkon Guest


    I get the "html" output (ie the tvguide) but it is in a MS Window alert box.

    I can't even copy and paste it.
    Falkon, Jun 19, 2004
  16. Rod Speed

    Randy Webb Guest


    In IE will copy it to the clipboard.
    Randy Webb, Jun 19, 2004
  17. It seems that the decryption function changes name too. In my current
    load of the page, it's called "kt".

    To see the "kt" function, simply write this in the address line:
    .... or replace "kt" with the name of the function in your page.

    It's rather cryptic. Here is another function that decodes the string
    (made without looking at their code; it confused more than it helped):

    function decode(s) {
      var buf = [], n = s.length, h = (n + 1) >> 1;
      for (var i = 0; i < h; i++) {
        // ...
      }
      return buf.join("");
    }

    (I don't know whether it works with even length strings, I only had
    an odd length string to test with).

    You can't hide parts of a web page from the person who controls the
    client :)
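
    The point about the client being in control can be shown in a few lines:
    any function the page defines can be turned back into its source text by
    converting it to a string. In this sketch kt is a stand-in with an
    invented body; the page's real decoder is not reproduced here.

```javascript
// Stand-in for the page's decoder function; the body is invented.
function kt(s) {
  return s.toUpperCase();
}

// Converting a function to a string yields its source text.
const ktSource = String(kt);
console.log(ktSource);
```

    Whatever name the page gives the function on a given load, the same
    conversion applies.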
    Lasse Reichstein Nielsen, Jun 19, 2004
  18. Rod Speed

    Unknown Guest

    Works OK here. I got timeout when I had ZA Pro running. Otherwise just OK.
    Clicking EDIT gets the page fine.
    Unknown, Jun 19, 2004
  19. Rod Speed

    Rod Speed Guest

    Yeah, it's just for my own use.
    And it doesn't matter if they do, completely legal.
    Could be, I certainly had a problem initially with the
    foxtel pages needing POST method for access.
    Rod Speed, Jun 19, 2004
  20. Rod Speed

    Rod Speed Guest

    Doesn't here, the clipboard just has that string in it, from when I pasted that to the address bar.
    Rod Speed, Jun 19, 2004
