Extract text from webpage?

Discussion in 'Computing' started by The Doctor, Feb 24, 2009.

  1. The Doctor

    The Doctor Guest

    Hi all,

    I am trying to extract some Chinese text along with the pronunciation
    symbols that appear above it. In the browser the text is on a line
    below the symbols, but when I copy and paste from the page into, say,
    Notepad, each symbol and character pair appears on a new line.
    I have looked at the HTML, and it looks as if each text character and
    its symbol are embedded inside HTML tags. How can I copy and paste
    so that the text and symbols appear in Word, Notepad, etc. in
    the same way that they are rendered in the browser?

    Thanks in advance.
    The Doctor, Feb 24, 2009
  2. Rod Speed

    Rod Speed Guest

    Try doing a right mouse click on the page and selecting View Source and getting it from that.
    Rod Speed, Feb 24, 2009
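
If the page source turns out to use HTML ruby markup (`<ruby>` for the character and `<rt>` for the symbol above it), the pairs could be pulled out of the saved source with a short script. This is only a sketch under that assumption: the thread never shows the actual HTML, and the names `RubyExtractor`, `extract_pairs` and `as_two_lines` are made up for illustration. It uses only Python's standard `html.parser` module.

```python
from html.parser import HTMLParser


class RubyExtractor(HTMLParser):
    """Collect (character, symbol) pairs from <ruby>..<rt>..</rt></ruby> markup.

    Assumption: the page uses ruby markup. Other layouts (tables, spans)
    would need different tag names here.
    """

    def __init__(self):
        super().__init__()
        self.pairs = []      # finished (base text, annotation) pairs
        self._in_ruby = False
        self._in_rt = False
        self._in_rp = False  # <rp> holds fallback parentheses; skipped
        self._base = []
        self._rt = []

    def handle_starttag(self, tag, attrs):
        if tag == "ruby":
            self._in_ruby = True
            self._base, self._rt = [], []
        elif tag == "rt":
            self._in_rt = True
        elif tag == "rp":
            self._in_rp = True

    def handle_endtag(self, tag):
        if tag == "rt":
            self._in_rt = False
            if self._in_ruby:
                # One annotation closes one pair; a single <ruby> may hold several.
                base = "".join(self._base).strip()
                symbol = "".join(self._rt).strip()
                self.pairs.append((base, symbol))
                self._base, self._rt = [], []
        elif tag == "rp":
            self._in_rp = False
        elif tag == "ruby":
            self._in_ruby = False

    def handle_data(self, data):
        if not self._in_ruby or self._in_rp:
            return
        if self._in_rt:
            self._rt.append(data)
        else:
            self._base.append(data)


def extract_pairs(html):
    """Return a list of (character, symbol) tuples found in the HTML string."""
    parser = RubyExtractor()
    parser.feed(html)
    parser.close()
    return parser.pairs


def as_two_lines(pairs):
    """Put the symbols on one line and the characters on the line below."""
    symbols = " ".join(symbol for _, symbol in pairs)
    chars = " ".join(base for base, _ in pairs)
    return symbols + "\n" + chars


if __name__ == "__main__":
    sample = "<p><ruby>中<rp>(</rp><rt>zhōng</rt><rp>)</rp></ruby><ruby>文<rt>wén</rt></ruby></p>"
    print(as_two_lines(extract_pairs(sample)))
```

The two output lines are only separated by single spaces, so the symbols will not sit exactly over their characters in a proportional font; pasting into a table in Word, one pair per column, may line them up better.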
  3. Goo Ruck Wid Dat
    son of a bitch, Feb 24, 2009

  4. I guess you need it as text???
    Otherwise, you might save it as a graphic.
    But I don't know what you plan on doing with it.
    brian w edginton, Feb 24, 2009
  5. jones

    jones Guest

    I was going to suggest a screen capture and just saving the area you want, but
    of course that will save it as a JPG or something similar.

    jones, Feb 25, 2009
  6. Can you post the URL so we can have a look?

    Firefox has some plug-ins for Chinese and Japanese. Might be worthwhile
    investigating those too.

    Dr. Sir John Howard, AC, WSCMoF, Feb 25, 2009
  7. Doug Jewell

    Doug Jewell Guest

    You only need to do that in the Northern Hemisphere. The
    Southern Hemisphere reverses it again so a normal
    left-to-right drag will work.
    Doug Jewell, Feb 25, 2009
