Parsing HTML using Perl
For this task, we (the students) were not allowed to use Perl's built-in HTML parsers or modules; we had to parse the HTML file by writing our own regular expression (regex) functions. First I did some research:
http://www.degraeve.com/tutorials/tutorial02.php
This tutorial only covers the fundamentals, so it is not really helpful for parsing HTML, but it does show how to handle POST form values (splitting the input into name and value pairs, getting rid of weird characters, converting letters, etc.).
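To make the form-handling steps concrete, here is a minimal sketch of my own (not code from the tutorial). It assumes the POST body arrives as a URL-encoded string; the helper name parse_form and the sample input are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: decode an application/x-www-form-urlencoded string
# into a hash of name => value. Repeated names keep only the last value.
sub parse_form {
    my ($raw) = @_;
    my %form;
    for my $pair (split /[&;]/, $raw) {
        # Split each pair into name and value at the first '='.
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $value = '' unless defined $value;
        for ($name, $value) {
            tr/+/ /;                               # '+' encodes a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # %XX -> character
        }
        $form{$name} = $value;
    }
    return %form;
}

# Made-up sample input, just to show the call.
my %form = parse_form('name=Jane+Doe&email=jane%40example.com');
print "$_ = $form{$_}\n" for sort keys %form;
```

In a real CGI script the raw string would probably be read from STDIN using the CONTENT_LENGTH environment variable, but that part is left out here.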
http://www.perlmonks.org/?node_id=585311
Someone asked how to recover an HTML file using Perl, and another user suggested HTML::TokeParser.
This might be overkill for this project, but it might be useful in a future task.
http://www.perl.com/pub/2006/01/19/analyzing_html.html
Again, this author (a teacher, it seems) used HTML::TreeBuilder to construct a tree structure and then did the parsing.
http://www.foo.be/docs/tpj/issues/vol5_1/tpj0501-0003.html
HTML::Parser
This is another module from CPAN.
Okay, it seems that lots of people have done this task before.
I NEED TO FIND SOMETHING THAT TEACHES ME HOW TO BUILD A PARSER FROM SCRATCH!!!
http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html
Remove tags
http://www.linuxquestions.org/questions/programming-9/perl-split-on-html-tag-89905/
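For what it's worth, here is a rough sketch of how splitting on tags with a regex could look. It is my own guess at a starting point, not code taken from any of the links above, and the helper names tokenize_html and strip_tags are made up. It assumes simple, well-behaved HTML: comments, script bodies, and a '>' inside an attribute value will break it.

```perl
use strict;
use warnings;

# Hypothetical helper: split HTML into a list of [type, text] tokens,
# where type is 'tag' or 'text'.
sub tokenize_html {
    my ($html) = @_;
    my @tokens;
    # The capturing group makes split() return the tags as well as
    # the text between them.
    for my $piece (split /(<[^>]*>)/, $html) {
        next if $piece eq '';    # split leaves empty strings between adjacent tags
        my $type = $piece =~ /^<[^>]*>\z/ ? 'tag' : 'text';
        push @tokens, [$type, $piece];
    }
    return @tokens;
}

# Hypothetical helper: drop the tags and keep only the text.
sub strip_tags {
    my ($html) = @_;
    return join '', map { $_->[1] } grep { $_->[0] eq 'text' } tokenize_html($html);
}

my $html = '<p>Hello <b>world</b>!</p>';
print "$_->[0]: $_->[1]\n" for tokenize_html($html);
print strip_tags($html), "\n";
```

Keeping the tags as tokens instead of throwing them away means the same list can be reused later, for example to pull attribute values out of the 'tag' entries with another regex.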