
Monday, May 2, 2011

Extract href links from HTML source with MATLAB

Extract the href links from an HTML source file by using MATLAB.
Every line of the source that contains a link is saved to a txt file.


%%
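% Count the string matches on each line and copy every matching line
% to dsave2.txt.
% Tip: beforehand, use find-and-replace (Ctrl+H) in a document editor to
% put a manual line break at each "href", so each link sits on its own
% line. Makes life easy.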
home
clc
filename = 'textsrc.txt';
literal = ' <a href="';

fid = fopen(filename, 'rt');
bbase = 'dsave2';
fid_sh = fopen([bbase '.txt'],'w');

y = 0;
jj = 1;
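while feof(fid) == 0
    tline = fgetl(fid);
    if ~ischar(tline), break; end   % fgetl returns -1 at end of file

    % strfind replaces findstr, which is no longer recommended
    matches = strfind(tline, literal);
    num = length(matches);
    if num > 0
        y = y + num;   % running total of matches
        % fprintf('%s\n',tline);
        fprintf(fid_sh, '%s \n', tline);
    end
    jj = jj+1;   % line counter
end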
fclose(fid);
fclose(fid_sh);
% The matching lines are now saved in dsave2.txt,
% which will be further processed by refinestr.m
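
The script above copies whole lines into the txt file, so each saved line still carries the surrounding HTML. Here is a minimal sketch of one way to pull out just the URL from such a line with regexp. This is my own illustration, not the contents of refinestr.m, and the example line and the variable names tokens and links are made up for the demo.

```matlab
% Hypothetical example line; not taken from the original textsrc.txt.
tline = '<p> <a href="http://usefulcodes.blogspot.com/search/label/matlab">matlab</a></p>';

% Capture everything between href=" and the next double quote.
tokens = regexp(tline, 'href="([^"]*)"', 'tokens');

% Each match comes back as a 1x1 cell holding the captured URL,
% so unwrap them into a flat cell array of strings.
links = cellfun(@(t) t{1}, tokens, 'UniformOutput', false);

% Print one URL per line.
for k = 1:numel(links)
    fprintf('%s\n', links{k});
end
```

If a line holds several links, links gets one entry per match, so the same loop prints them all.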


Creating HTML files that load favorite pages, with MATLAB

I have saved the sub-links (the part of the address after the home page) of pages such as
http://usefulcodes.blogspot.com/2011/03/how-to-block-website-in-macos.html
http://usefulcodes.blogspot.com/search/label/matlab
etc...

The first part of each link (in the code below) is the home page, and the sub-link read from the file is appended to it. The sub-links are saved one per line in a text file, so that the textread function in MATLAB reads the content as a cell array with one cell per line. Each entry can then be accessed in a loop as file{j}.




filename = 'dsave1.txt';

% Read the file into a cell array, one cell per line.
file = textread(filename, '%s', 'delimiter', '\n', ...
    'whitespace', '');

for j = 1:length(file)
    link = ['http://www.urlurl.com/' file{j}];   % home page + sub-link

    file_name = [num2str(j) '.htm'];   % 1.htm, 2.htm, ...

    % Write one small HTML file that loads the link in an iframe.
    fid = fopen(file_name, 'wb');
    fprintf(fid, '<html><head><title>1</title></head><body>');

    fprintf(fid, '<iframe name="FRAME" src="%s" width="1040" height="700" frameborder="0" scrolling="no"></iframe></body></html>', link);

    fclose(fid);
end

Just open the generated .htm files with Chrome. Tada!